python - Google App Engine: How to write large files to Google Cloud Storage

Reposted. Author: 太空狗. Updated: 2023-10-30 01:23:55

I am trying to save large files from Google App Engine's Blobstore to Google Cloud Storage to make backups easier.

It works for small files (<10 MB), but for larger files it becomes unstable and GAE throws a FileNotOpenedError.

My code:

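    from google.appengine.api import files
    from google.appengine.ext import blobstore

    # Fragment: PATH is a class attribute; the loop runs inside a handler method.
    PATH = '/gs/backupbucket/'

    for df in DocumentFile.all():
        fn = df.blob.filename
        br = blobstore.BlobReader(df.blob)
        write_path = files.gs.create(self.PATH + fn.encode('utf-8'),
                                     mime_type='application/zip',
                                     acl='project-private')
        with files.open(write_path, 'a') as fp:
            # Copy the blob in 100,000-byte chunks until the reader is exhausted.
            while True:
                buf = br.read(100000)
                if buf == "":
                    break
                fp.write(buf)
        files.finalize(write_path)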

(This runs in a task queue to avoid exceeding the execution time limit.)

It throws a FileNotOpenedError:

Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~simplerepository/1.354754771592783168/processFiles.py", line 249, in post
    fp.write(buf)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 281, in __exit__
    self.close()
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 275, in close
    self._make_rpc_call_with_retry('Close', request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 388, in _make_rpc_call_with_retry
    _make_call(method, request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 236, in _make_call
    _raise_app_error(e)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 179, in _raise_app_error
    raise FileNotOpenedError()

I have investigated further and, according to a comment on GAE Issue 5371, the Files API closes the file every 30 seconds. I have not seen this documented anywhere else.

I have tried to work around this by closing and reopening the file at intervals, but now I get a WrongOpenModeError. The code below is edited from the first version of this post: I have added a 0.5-second pause between closing and reopening the file. It still throws a WrongOpenModeError.

My code (updated):

    PATH = '/gs/backupbucket/'

    for df in DocumentFile.all():
        fn = df.blob.filename
        br = blobstore.BlobReader(df.blob)
        write_path = files.gs.create(self.PATH + fn.encode('utf-8'),
                                     mime_type='application/zip',
                                     acl='project-private')
        fp = files.open(write_path, 'a')
        c = 0
        while True:
            if c == 5:
                c = 0
                fp.close()
                files.finalize(write_path)
                time.sleep(0.5)
                fp = files.open(write_path, 'a')
            c = c + 1
            buf = br.read(100000)
            if buf == "":
                break
            fp.write(buf)
        files.finalize(write_path)

Stack trace:

Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~simplerepository/1.354894420907462278/processFiles.py", line 267, in get
    fp.write(buf)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 310, in write
    self._make_rpc_call_with_retry('Append', request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 388, in _make_rpc_call_with_retry
    _make_call(method, request, response)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 236, in _make_call
    _raise_app_error(e)
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/files/file.py", line 188, in _raise_app_error
    raise WrongOpenModeError()

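I tried to find information about WrongOpenModeError, but the only place that mentions it is appengine.api.files.file.py itself.

Any suggestions on how to solve this and save large files to Google Cloud Storage would be much appreciated. Thanks!

Best answer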

IMO you shouldn't call files.finalize(write_path) at intervals. finalize makes the file readable, and you can't change it back to writable afterwards.
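The point that finalize makes a file read-only can be illustrated with a minimal sketch. It does not call the real Files API: FakeStore and _Handle are hypothetical in-memory stand-ins that model only the "no append after finalize" rule, and copy_in_chunks, chunk_size, and reopen_every are illustrative names. The loop closes and reopens the file every few chunks but finalizes exactly once, after the last write. Whether periodic reopening is enough to avoid the 30-second close mentioned in the question is an assumption, not something the answer confirms.

```python
import io


class _Handle(object):
    """Hypothetical write handle that appends to a shared bytearray."""

    def __init__(self, buf):
        self._buf = buf
        self.closed = False

    def write(self, chunk):
        if self.closed:
            raise IOError('file is not open')
        self._buf.extend(chunk)

    def close(self):
        self.closed = True


class FakeStore(object):
    """Hypothetical stand-in for the Files API, for illustration only.

    It models a single rule: a finalized file cannot be reopened for append.
    """

    def __init__(self):
        self.data = {}          # path -> bytearray with the written content
        self.finalized = set()  # paths that have become read-only

    def create(self, path):
        self.data[path] = bytearray()
        return path

    def open(self, path, mode):
        if mode == 'a' and path in self.finalized:
            raise IOError('finalized file cannot be reopened for append')
        return _Handle(self.data[path])

    def finalize(self, path):
        self.finalized.add(path)


def copy_in_chunks(reader, store, path, chunk_size=100000, reopen_every=5):
    """Copy reader to path, reopening periodically and finalizing once."""
    write_path = store.create(path)
    fp = store.open(write_path, 'a')
    chunks_since_open = 0
    try:
        while True:
            buf = reader.read(chunk_size)
            if not buf:
                break
            if chunks_since_open == reopen_every:
                fp.close()  # close only; do NOT finalize here
                fp = store.open(write_path, 'a')
                chunks_since_open = 0
            fp.write(buf)
            chunks_since_open += 1
    finally:
        fp.close()
    store.finalize(write_path)  # the only finalize call, after all writes
    return write_path


store = FakeStore()
source = io.BytesIO(b'x' * 1234567)
path = copy_in_chunks(source, store, '/gs/backupbucket/example.zip')
```

In the question's updated code, the equivalent change would be to drop the files.finalize(write_path) call inside the while loop and keep only the one after it.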

Regarding python - Google App Engine: How to write large files to Google Cloud Storage, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8201283/
