作者热门文章
- mongodb - 在 MongoDB mapreduce 中,如何展平值对象?
- javascript - 对象传播与 Object.assign
- html - 输入类型 ="submit"Vs 按钮标签它们可以互换吗?
- sql - 使用 MongoDB 而不是 MS SQL Server 的优缺点
我想使用 urllib 下载一个文件,并在保存前解压内存中的文件。
这就是我现在拥有的:
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
outfile = open(outFilePath, 'w')
outfile.write(decompressedFile.read())
这最终会写入空文件。我怎样才能达到我所追求的目标?
更新答案:
#! /usr/bin/env python2
import urllib2
import StringIO
import gzip
baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
# check filename: it may change over time, due to new updates
filename = "man-pages-5.00.tar.gz"
outFilePath = filename[:-3]
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile)
with open(outFilePath, 'w') as outfile:
outfile.write(decompressedFile.read())
最佳答案
您需要在写入 compressedFile
之后但在将其传递给 gzip.GzipFile()
之前寻找它的开头。否则它将被 gzip
模块从末尾读取,并显示为一个空文件。见下文:
#! /usr/bin/env python
import urllib2
import StringIO
import gzip
baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
filename = "man-pages-3.34.tar.gz"
outFilePath = "man-pages-3.34.tar"
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
#
# Set the file's current position to the beginning
# of the file so that gzip.GzipFile can read
# its contents from the top.
#
compressedFile.seek(0)
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
with open(outFilePath, 'w') as outfile:
outfile.write(decompressedFile.read())
关于python - 下载并解压内存中的gzip压缩文件?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15352668/
我是一名优秀的程序员,十分优秀!