
python - Boto S3 occasionally throws httplib.IncompleteRead

Reposted. Author: 太空狗. Updated: 2023-10-29 20:26:05

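I have several daemons that use boto to read many files from Amazon S3. Every few days I hit a situation where httplib.IncompleteRead is thrown from deep inside boto. If I retry the request, it immediately fails with another IncompleteRead. Even if I call bucket.connection.close(), all further requests still fail.

I suspect I may have stumbled on a bug in boto, but no one else seems to have run into it. Am I doing something wrong? All of the daemons are single-threaded, and I have tried setting is_secure both ways.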

Traceback (most recent call last):
  ...
  File "<file_wrapper.py>", line 22, in next
    line = self.readline()
  File "<file_wrapper.py>", line 37, in readline
    data = self.fh.read(self.buffer_size)
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py>", line 378, in read
    self.close()
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py>", line 349, in close
    self.resp.read()
  File "<virtualenv/lib/python2.6/site-packages/boto/connection.py>", line 411, in read
    self._cached_response = httplib.HTTPResponse.read(self)
  File "/usr/lib/python2.6/httplib.py", line 529, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python2.6/httplib.py", line 621, in _safe_read
    raise IncompleteRead(''.join(s), amt)

Environment:

  • Amazon EC2
  • Ubuntu 11.10
  • Python 2.6.7
  • boto 2.12.0

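Best answer

I struggled with this problem for some time while running long-lived processes that read large amounts of data from S3. I decided to post my solution here for posterity.

First, I am sure the hack pointed out by @Glenn works, but I chose not to use it because I consider it intrusive (it hacks httplib) and unsafe (it blindly returns whatever it got, i.e. return e.partial, even though it may be a genuine error case).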
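The hack itself is not reproduced in this post. As a rough sketch only, assuming it amounts to catching IncompleteRead around a read call and returning e.partial, the following shows why that can hide a truncated download. The names swallow_incomplete_read and truncated_read are hypothetical, and the sketch wraps a plain function instead of patching httplib.HTTPResponse.read globally. The import fallback is only there so the sketch also runs on Python 3.

```python
try:
    import httplib  # Python 2, as used in the post
except ImportError:
    import http.client as httplib  # Python 3 name of the same module


def swallow_incomplete_read(read_func):
    """Wrap a read function so IncompleteRead yields the partial body (hypothetical helper)."""
    def wrapper(*args, **kwargs):
        try:
            return read_func(*args, **kwargs)
        except httplib.IncompleteRead as e:
            # Returns whatever arrived before the connection dropped,
            # with no signal to the caller that bytes are missing.
            return e.partial
    return wrapper


def truncated_read():
    """Hypothetical read: 3 bytes arrived, 7 more were expected."""
    raise httplib.IncompleteRead(b"abc", 7)


body = swallow_incomplete_read(truncated_read)()
```

Here the caller receives b"abc" and cannot tell that the response was cut short, which is the "blindly returns whatever it got" concern described above.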

Here is the solution I eventually came up with, which seems to work.

I use this general-purpose retry function:

import time, logging, httplib, socket

def run_with_retries(func, num_retries, sleep=None, exception_types=Exception, on_retry=None):
    for i in range(num_retries):
        try:
            return func()  # call the function
        except exception_types as e:
            # failed on one of the expected exception types
            if i == num_retries - 1:
                raise  # this was the last attempt; re-raise
            logging.warning("operation %s failed with error %r. will retry %d more times",
                            func, e, num_retries - i - 1)
            if on_retry is not None:
                on_retry()
            if sleep is not None:
                time.sleep(sleep)
    assert 0  # should not reach this point

Now, when reading a file from S3, I use this function, which retries internally on IncompleteRead errors. When an error occurs, I call key.close() before retrying.

def read_s3_file(key):
    """
    Reads the entire contents of a file on S3.
    @param key: a boto.s3.key.Key instance
    """
    return run_with_retries(
        key.read, num_retries=3, sleep=0.5,
        exception_types=(httplib.IncompleteRead, socket.error),
        # close the connection before retrying
        on_retry=lambda: key.close()
    )
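The retry-then-close behavior can be checked without touching S3. The sketch below is an illustration under stated assumptions, not part of the original answer: FakeKey is a hypothetical stand-in for boto.s3.key.Key that fails a set number of times, read_fake_file mirrors read_s3_file with sleep set to 0, and a compact copy of run_with_retries is included so the block runs on its own. The import fallback is an addition so it also runs on Python 3.

```python
import logging
import socket
import time

try:
    import httplib  # Python 2, as in the original post
except ImportError:
    import http.client as httplib  # Python 3 name of the same module


def run_with_retries(func, num_retries, sleep=None,
                     exception_types=Exception, on_retry=None):
    """Compact copy of the retry helper, so this sketch is self-contained."""
    for i in range(num_retries):
        try:
            return func()
        except exception_types as e:
            if i == num_retries - 1:
                raise  # last attempt: let the error propagate
            logging.warning("%r failed with %r; %d retries left",
                            func, e, num_retries - i - 1)
            if on_retry is not None:
                on_retry()
            if sleep is not None:
                time.sleep(sleep)


class FakeKey(object):
    """Hypothetical stand-in for boto.s3.key.Key; not part of boto."""

    def __init__(self, failures, payload):
        self.failures = failures  # number of reads that fail before one succeeds
        self.payload = payload
        self.events = []          # records the order of read/close calls

    def read(self):
        self.events.append("read")
        if self.failures > 0:
            self.failures -= 1
            raise httplib.IncompleteRead(self.payload[:2], len(self.payload) - 2)
        return self.payload

    def close(self):
        self.events.append("close")


def read_fake_file(key):
    """Same shape as read_s3_file, but with no sleep between attempts."""
    return run_with_retries(
        key.read, num_retries=3, sleep=0,
        exception_types=(httplib.IncompleteRead, socket.error),
        on_retry=key.close,
    )


key = FakeKey(failures=2, payload=b"hello world")
data = read_fake_file(key)
```

With two simulated failures, the third attempt succeeds and key.events shows a close() between each pair of reads. With three failures, the last IncompleteRead propagates to the caller, so a genuine persistent error is not silently swallowed.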

Regarding python - Boto S3 occasionally throws httplib.IncompleteRead, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19373368/
