python - Is this a bug in this gzip inflate method?

Reposted. Author: 太空狗. Updated: 2023-10-29 21:37:34

When searching for how to inflate gzip-compressed data on iOS, the following method appears in a number of results:

- (NSData *)gzipInflate
{
    if ([self length] == 0) return self;

    unsigned full_length = [self length];
    unsigned half_length = [self length] / 2;

    NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
    BOOL done = NO;
    int status;

    z_stream strm;
    strm.next_in = (Bytef *)[self bytes];
    strm.avail_in = [self length];
    strm.total_out = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;

    if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;
    while (!done)
    {
        // Make sure we have enough room and reset the lengths.
        if (strm.total_out >= [decompressed length])
            [decompressed increaseLengthBy: half_length];
        strm.next_out = [decompressed mutableBytes] + strm.total_out;
        strm.avail_out = [decompressed length] - strm.total_out;

        // Inflate another chunk.
        status = inflate (&strm, Z_SYNC_FLUSH);
        if (status == Z_STREAM_END) done = YES;
        else if (status != Z_OK) break;
    }
    if (inflateEnd (&strm) != Z_OK) return nil;

    // Set real length.
    if (done)
    {
        [decompressed setLength: strm.total_out];
        return [NSData dataWithData: decompressed];
    }
    else return nil;
}

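But I have run into some data samples (deflated on a Linux machine with Python's gzip module) that this method fails to inflate when running on iOS. Here is what happens:

In the last iteration of the while loop, inflate() returns Z_BUF_ERROR and the loop exits. But inflateEnd(), which is called after the loop, returns Z_OK. The code then assumes that because inflate() never returned Z_STREAM_END, the inflation failed, and it returns nil.

According to this page, http://www.zlib.net/zlib_faq.html#faq05, Z_BUF_ERROR is not a fatal error, and my tests with a limited set of samples suggest that the data is inflated successfully whenever inflateEnd() returns Z_OK, even if the last call to inflate() did not return Z_OK. It looks as if inflateEnd() finishes inflating the last chunk of data.

I know very little about compression and how gzip works, so I am hesitant to change this code without fully understanding what it does. I am hoping someone who knows more about the topic can shed light on this potential logic flaw in the code above and suggest a way to fix it.

Another method that turns up on Google, and that seems to suffer from the same problem, can be found here: https://github.com/nicklockwood/GZIP/blob/master/GZIP/NSData%2BGZIP.m

Edit:

So, it is a bug! Now, how do we fix it? Below is my attempt. Code review, anyone?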

- (NSData *)gzipInflate
{
    if ([self length] == 0) return self;

    unsigned full_length = [self length];
    unsigned half_length = [self length] / 2;

    NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
    int status;

    z_stream strm;
    strm.next_in = (Bytef *)[self bytes];
    strm.avail_in = [self length];
    strm.total_out = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;

    if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;

    do
    {
        // Make sure we have enough room and reset the lengths.
        if (strm.total_out >= [decompressed length])
            [decompressed increaseLengthBy: half_length];
        strm.next_out = [decompressed mutableBytes] + strm.total_out;
        strm.avail_out = [decompressed length] - strm.total_out;

        // Inflate another chunk.
        status = inflate (&strm, Z_SYNC_FLUSH);

        switch (status) {
            case Z_NEED_DICT:
                status = Z_DATA_ERROR; /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
            case Z_STREAM_ERROR:
                (void)inflateEnd(&strm);
                return nil;
        }
    } while (status != Z_STREAM_END);

    (void)inflateEnd (&strm);

    // Set real length.
    if (status == Z_STREAM_END)
    {
        [decompressed setLength: strm.total_out];
        return [NSData dataWithData: decompressed];
    }
    else return nil;
}

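Edit 2:

Here is a sample Xcode project that illustrates the problem I am running into. The deflate happens on the server side, and the data is base64-encoded and then URL-encoded before being transported over HTTP. I have embedded the URL-encoded base64 string in ViewController.m. The url-decode and base64-decode steps, as well as your gzipInflate method, are in NSDataExtension.m.

https://dl.dropboxusercontent.com/u/38893107/gzip/GZIPTEST.zip

Here is the binary file as deflated by the Python gzip library:

https://dl.dropboxusercontent.com/u/38893107/gzip/binary.zip

Here is the URL-encoded base64 string that is transported over HTTP: https://dl.dropboxusercontent.com/u/38893107/gzip/urlEncodedBase64.txt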
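
The transport pipeline described in Edit 2 (gzip, then base64, then URL encoding) can be sketched in Python, the language used on the server side. This is only an illustration under assumptions: the function names encode_for_transport and decode_from_transport are hypothetical, and the actual server code is not shown in the question, so its exact options may differ.

```python
import base64
import gzip
import urllib.parse


def encode_for_transport(payload: bytes) -> str:
    """gzip-compress, base64-encode, then URL-encode the payload."""
    compressed = gzip.compress(payload)
    b64 = base64.b64encode(compressed).decode("ascii")
    # safe="" also escapes "/", which can appear in standard base64 output.
    return urllib.parse.quote(b64, safe="")


def decode_from_transport(text: str) -> bytes:
    """Undo the three steps in the opposite order."""
    b64 = urllib.parse.unquote(text)
    compressed = base64.b64decode(b64)
    return gzip.decompress(compressed)
```

On the iOS side the same three steps have to be undone in the same reverse order (url-decode, base64-decode, inflate) before the gzip data reaches gzipInflate.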

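Best answer

Yes, it is a bug.

It is in fact correct that if inflate() does not return Z_STREAM_END, then you have not finished inflating. inflateEnd() returning Z_OK does not mean much: only that it was given a valid state and was able to free the memory.

So inflate() must eventually return Z_STREAM_END before you can declare success. However, Z_BUF_ERROR is not a reason to give up. In that case you simply call inflate() again with more input or more output space. Then you will get the Z_STREAM_END.

From the documentation in zlib.h: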

/* ...
Z_BUF_ERROR if no progress is possible or if there was not enough room in the
output buffer when Z_FINISH is used. Note that Z_BUF_ERROR is not fatal, and
inflate() can be called again with more input and more output space to
continue decompressing.
... */
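
The rule above (keep calling inflate() while progress is possible, and report success only once the end of the stream has been seen) can be sketched with Python's zlib module. This is a sketch under assumptions, not a port of the Objective-C method: gzip_inflate and chunk_size are hypothetical names, the max_length argument stands in for a limited avail_out, and the decompressor's eof attribute plays the role of Z_STREAM_END.

```python
import gzip
import zlib


def gzip_inflate(data: bytes, chunk_size: int = 64):
    """Return the decompressed bytes, or None if the stream is incomplete or invalid."""
    # wbits = 15 + 32: auto-detect a zlib or gzip header, as in inflateInit2(&strm, 15+32)
    d = zlib.decompressobj(wbits=15 + 32)
    out = bytearray()
    pending = data
    try:
        while not d.eof:
            # max_length caps the output of this call, like a small output buffer
            piece = d.decompress(pending, chunk_size)
            out += piece
            pending = d.unconsumed_tail
            if not piece and not pending:
                break  # no input left and no progress: the stream is truncated
    except zlib.error:
        return None  # invalid data
    # Success only if the end-of-stream marker was actually reached
    return bytes(out) if d.eof else None
```

For example, gzip_inflate(gzip.compress(b"abc")) returns the original bytes, while the same input with its last few bytes cut off returns None, because the end of the stream is never reached.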

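Update:

Since there is buggy code all over the place, below is correct code to implement the desired method. This code handles incomplete gzip streams, concatenated gzip streams, and very large gzip streams. For very large gzip streams, the unsigned lengths in z_stream are not large enough when compiled as a 64-bit executable: NSUInteger is 64 bits, whereas unsigned is 32 bits. In that case, you have to loop over the input to feed it to inflate().

This example simply returns nil on any error. The nature of the error is noted in a comment after each return nil;, in case more sophisticated error handling is desired.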

- (NSData *) gzipInflate
{
    z_stream strm;

    // Initialize input
    strm.next_in = (Bytef *)[self bytes];
    NSUInteger left = [self length];        // input left to decompress
    if (left == 0)
        return nil;                         // incomplete gzip stream

    // Create starting space for output (guess double the input size, will grow
    // if needed -- in an extreme case, could end up needing more than 1000
    // times the input size)
    NSUInteger space = left << 1;
    if (space < left)
        space = NSUIntegerMax;
    NSMutableData *decompressed = [NSMutableData dataWithLength: space];
    space = [decompressed length];

    // Initialize output
    strm.next_out = (Bytef *)[decompressed mutableBytes];
    NSUInteger have = 0;                    // output generated so far

    // Set up for gzip decoding
    strm.avail_in = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    int status = inflateInit2(&strm, (15+16));
    if (status != Z_OK)
        return nil;                         // out of memory

    // Decompress all of self
    do {
        // Allow for concatenated gzip streams (per RFC 1952)
        if (status == Z_STREAM_END)
            (void)inflateReset(&strm);

        // Provide input for inflate
        if (strm.avail_in == 0) {
            strm.avail_in = left > UINT_MAX ? UINT_MAX : (unsigned)left;
            left -= strm.avail_in;
        }

        // Decompress the available input
        do {
            // Allocate more output space if none left
            if (space == have) {
                // Double space, handle overflow
                space <<= 1;
                if (space < have) {
                    space = NSUIntegerMax;
                    if (space == have) {
                        // space was already maxed out!
                        (void)inflateEnd(&strm);
                        return nil;         // output exceeds integer size
                    }
                }

                // Increase space
                [decompressed setLength: space];
                space = [decompressed length];

                // Update output pointer (might have moved)
                strm.next_out = (Bytef *)[decompressed mutableBytes] + have;
            }

            // Provide output space for inflate
            strm.avail_out = space - have > UINT_MAX ? UINT_MAX :
                             (unsigned)(space - have);
            have += strm.avail_out;

            // Inflate and update the decompressed size
            status = inflate (&strm, Z_SYNC_FLUSH);
            have -= strm.avail_out;

            // Bail out if any errors
            if (status != Z_OK && status != Z_BUF_ERROR &&
                status != Z_STREAM_END) {
                (void)inflateEnd(&strm);
                return nil;                 // invalid gzip stream
            }

            // Repeat until all output is generated from provided input (note
            // that even if strm.avail_in is zero, there may still be pending
            // output -- we're not done until the output buffer isn't filled)
        } while (strm.avail_out == 0);

        // Continue until all input consumed
    } while (left || strm.avail_in);

    // Free the memory allocated by inflateInit2()
    (void)inflateEnd(&strm);

    // Verify that the input is a valid gzip stream
    if (status != Z_STREAM_END)
        return nil;                         // incomplete gzip stream

    // Set the actual length and return the decompressed data
    [decompressed setLength: have];
    return decompressed;
}
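
For comparison, the handling of incomplete and concatenated gzip streams can be sketched in Python, the language that produced the test data. This is a sketch under assumptions rather than an equivalent of the method above: gzip_inflate_all is a hypothetical name, a fresh decompressor per gzip member stands in for inflateReset(), and the 32-bit length chunking is not reproduced because Python's zlib module takes the whole input buffer at once.

```python
import gzip
import zlib


def gzip_inflate_all(data: bytes):
    """Decompress one or more concatenated gzip members; return None on any error."""
    if not data:
        return None  # incomplete gzip stream
    out = bytearray()
    remaining = data
    while remaining:
        # wbits = 15 + 16: gzip format only, matching inflateInit2(&strm, 15+16)
        d = zlib.decompressobj(wbits=15 + 16)
        try:
            out += d.decompress(remaining)
        except zlib.error:
            return None  # invalid gzip stream
        if not d.eof:
            return None  # incomplete gzip stream
        remaining = d.unused_data  # bytes that follow this gzip member
    return bytes(out)
```

As in the Objective-C version, trailing bytes that are not another valid gzip member make the whole call fail instead of being silently ignored.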

Source: the original question, "python - Is this a bug in this gzip inflate method?", is on Stack Overflow: https://stackoverflow.com/questions/17820664/
