
python - Opening the response of urllib2.urlopen() on the fly with zipfile.ZipFile

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:23:55

It seems that zipfile.ZipFile requires random access, and the "file-like" object returned by urllib2 does not support it.

I tried wrapping it in io.BufferedRandom, but got:

AttributeError: addinfourl instance has no attribute 'seekable'
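
The random-access requirement can be observed directly. The sketch below is written for Python 3 and is only an illustration: `SeekLogger` and `make_zip_bytes` are hypothetical helpers, and the archive is built in memory so no network is needed. It wraps a seekable in-memory file, records every seek() call, and checks whether ZipFile positioned itself relative to the end of the file.

```python
import io
import zipfile


class SeekLogger(io.BytesIO):
    """In-memory file that records every seek() call (hypothetical helper)."""

    def __init__(self, data):
        super().__init__(data)
        self.seeks = []

    def seek(self, offset, whence=io.SEEK_SET):
        self.seeks.append((offset, whence))
        return super().seek(offset, whence)


def make_zip_bytes():
    """Build a tiny archive in memory so the example needs no network."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello")
    return buf.getvalue()


logged = SeekLogger(make_zip_bytes())
with zipfile.ZipFile(logged) as zf:
    names = zf.namelist()

# True if ZipFile positioned itself relative to the end of the file.
seeks_from_end = any(whence == io.SEEK_END for _, whence in logged.seeks)
```

The ZIP format keeps its directory at the end of the archive, which is why a stream that can only be read forward is not enough on its own.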

Best answer

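In the absence of other replies, I went with the home-grown solution below. It probably will not reduce the memory footprint when reading a zip file, but it may improve latency when the zip header is read first.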

from io import BytesIO, SEEK_SET, SEEK_END

def _ceil_div(a, b):
    # Floor division keeps the result an integer on Python 2 and Python 3.
    return (a + b - 1) // b

def _align_up(a, b):
    # Round a up to the next multiple of b.
    return _ceil_div(a, b) * b

class BufferedRandomReader(object):
    """Create random-access, read-only buffered stream adapter from a sequential
    input stream which does not support random access (i.e., ``seek()``)

    Example::

        >>> stream = BufferedRandomReader(BytesIO('abc'))
        >>> print stream.read(2)
        ab
        >>> stream.seek(0)
        0L
        >>> print stream.read()
        abc

    """

    def __init__(self, fin, chunk_size=512):
        self._fin = fin
        self._buf = BytesIO()  # cache of everything read from fin so far
        self._eof = False
        self._chunk_size = chunk_size

    def tell(self):
        return self._buf.tell()

    def read(self, n=-1):
        """Read at most ``n`` bytes from the file (less if the ``read`` hits
        end-of-file before obtaining size bytes).

        If ``n`` argument is negative or omitted, read all data until end of
        file is reached. The bytes are returned as a string object. An empty
        string is returned when end of file is encountered immediately.
        """
        pos = self._buf.tell()
        end = self._buf.seek(0, SEEK_END)  # leaves the cache positioned for appending

        if n < 0:
            if not self._eof:
                self._buf.write(self._fin.read())
                self._eof = True
        else:
            req = pos + n - end  # bytes still missing from the cache

            if req > 0 and not self._eof:  # need to grow
                bcount = _align_up(req, self._chunk_size)
                data = self._fin.read(bcount)

                self._buf.write(data)
                # A short read is treated as end of stream.
                self._eof = len(data) < bcount

        self._buf.seek(pos)

        return self._buf.read(n)

    def seek(self, offset, whence=SEEK_SET):

        if whence == SEEK_END:
            # The end is unknown until the whole source has been read.
            if not self._eof:
                self._buf.seek(0, SEEK_END)
                self._buf.write(self._fin.read())
                self._eof = True
            return self._buf.seek(offset, SEEK_END)

        return self._buf.seek(offset, whence)

    def close(self):
        self._fin.close()
        self._buf.close()
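
The class above targets Python 2. For readers on Python 3, the same caching idea might look roughly like the sketch below. This is not the answer's code: `SeekableAdapter` and `ForwardOnly` are hypothetical names, the `seekable()` method is added on the assumption that Python 3's zipfile asks for it, and the zip archive is built in memory to stand in for a download.

```python
import io
import zipfile


class SeekableAdapter:
    """Read-only wrapper that caches a forward-only stream so it can seek."""

    def __init__(self, fin, chunk_size=512):
        self._fin = fin
        self._buf = io.BytesIO()  # everything read from fin so far
        self._eof = False
        self._chunk_size = chunk_size

    def _fill_to(self, target):
        """Pull from the source until the cache holds `target` bytes (None = all)."""
        pos = self._buf.tell()
        self._buf.seek(0, io.SEEK_END)
        while not self._eof and (target is None or self._buf.tell() < target):
            data = self._fin.read(self._chunk_size)
            if data:
                self._buf.write(data)
            else:
                self._eof = True  # an empty read marks end of stream
        self._buf.seek(pos)

    def seekable(self):
        return True

    def tell(self):
        return self._buf.tell()

    def read(self, n=-1):
        if n is None or n < 0:
            self._fill_to(None)
        else:
            self._fill_to(self._buf.tell() + n)
        return self._buf.read(n)

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_END:
            self._fill_to(None)  # the end is unknown until all is read
        return self._buf.seek(offset, whence)

    def close(self):
        self._fin.close()
        self._buf.close()


class ForwardOnly:
    """Stand-in for an HTTP response: read() only, no seek() or tell()."""

    def __init__(self, data):
        self._src = io.BytesIO(data)

    def read(self, n=-1):
        return self._src.read(n)

    def close(self):
        self._src.close()


# Build a small archive in memory instead of downloading one.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("hello.txt", "hello")

source = ForwardOnly(payload.getvalue())
with zipfile.ZipFile(SeekableAdapter(source)) as zf:
    names = zf.namelist()
    content = zf.read("hello.txt")
```

Unlike the original, this version treats only an empty read as end of stream, so a source that returns short reads before the end is still consumed fully.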

Usage example:

import urllib2
req = urllib2.urlopen('http://test/file.zip')

import zipfile
zf = zipfile.ZipFile(BufferedRandomReader(req), 'r')

...
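
Because ZipFile looks for the archive's directory at the end of the file, an adapter like the one above is likely to have buffered the whole response by the time the first member can be listed. A simpler alternative, which is not part of the original answer, is to read the response completely into a BytesIO and hand that to ZipFile. The Python 3 sketch below assumes the archive fits in memory; `open_zip_from_stream` and `ReadOnlySource` are hypothetical names, and the archive is built locally so the code runs without a network.

```python
import io
import zipfile


def open_zip_from_stream(stream):
    """Read a forward-only stream to its end and open the bytes as a ZipFile.

    `open_zip_from_stream` is a hypothetical helper, not a library function.
    """
    return zipfile.ZipFile(io.BytesIO(stream.read()))


class ReadOnlySource:
    """Stand-in for an HTTP response: offers read() but no seek()."""

    def __init__(self, data):
        self._src = io.BytesIO(data)

    def read(self, n=-1):
        return self._src.read(n)


# Build a small archive in memory so the sketch runs without a network.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("a.txt", "alpha")

with open_zip_from_stream(ReadOnlySource(archive.getvalue())) as zf:
    member_names = zf.namelist()
    member_data = zf.read("a.txt")

# With a real download on Python 3 the call would look like:
#   import urllib.request
#   zf = open_zip_from_stream(urllib.request.urlopen('http://test/file.zip'))
```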

Regarding "python - Opening the response of urllib2.urlopen() on the fly with zipfile.ZipFile", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/23579088/
