
block - Bzip2 block header: 1AY&SY

Reposted. Author: 行者123. Updated: 2023-12-02 01:58:55

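This is a question about the bzip2 archive format. Any bzip2 archive consists of a file header, one or more blocks, and a trailer structure. Every block should start with "1AY&SY", which is how the six bytes 0x314159265359 (the digits of pi in BCD encoding) read as ASCII. According to the source of bzip2: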

/*--
A 6-byte block header, the value chosen arbitrarily
as 0x314159265359 :-). A 32 bit value does not really
give a strong enough guarantee that the value will not
appear by chance in the compressed datastream. Worst-case
probability of this event, for a 900k block, is about
2.0e-3 for 32 bits, 1.0e-5 for 40 bits and 4.0e-8 for 48 bits.
For a compressed file of size 100Gb -- about 100000 blocks --
only a 48-bit marker will do. NB: normal compression/
decompression do *not* rely on these statistical properties.
They are only important when trying to recover blocks from
damaged files.
--*/
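
The figures in that comment can be roughly reproduced. The sketch below is only a back-of-the-envelope estimate, not the bzip2 author's own derivation (the comment does not say how the numbers were obtained). It assumes the compressed bits are independent and uniformly random, and it treats a 900k block as 900,000 bytes of output in which the marker could start at any bit position. The function name `chance_match_estimate` is hypothetical. Under those assumptions the results land in the same order of magnitude as the quoted values.

```python
# The 6-byte marker, read as ASCII, is the string the question greps for.
BLOCK_MAGIC = bytes.fromhex("314159265359")
assert BLOCK_MAGIC == b"1AY&SY"


def chance_match_estimate(block_bytes, marker_bits):
    """Upper-bound estimate of a marker appearing by chance in one block.

    Assumption: every bit position is a possible start, and each position
    matches with probability 2**-marker_bits (union bound).
    """
    bit_positions = block_bytes * 8
    return bit_positions / 2 ** marker_bits


estimates = {bits: chance_match_estimate(900_000, bits) for bits in (32, 40, 48)}
for bits, p in estimates.items():
    print(f"{bits}-bit marker: about {p:.1e}")
```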

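The question is: does every bzip2 file have its block starts aligned to a byte boundary? I mean all files created by the reference implementation of bzip2, the bzip2-1.0.5+ utility.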

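I think bzip2 may parse the stream not as a byte stream but as a bit stream (the blocks themselves are Huffman-coded, and Huffman coding is by design not byte-aligned).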

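So, in other words: is the count reported by grep -c '1AY&SY' greater than (Huffman coding may produce 1AY&SY inside a block) or equal to the number of bzip2 blocks in the file?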

Best answer

bzip2 works on a bit stream.

From http://blastedbio.blogspot.com/2011/11/random-access-to-bzip2.html:

Anyway, the important bits are that a BZIP2 file contains one or more "streams", which are byte aligned, each containing one (zero?) or more "blocks", which are not byte aligned, followed by an end of stream marker (the six bytes 0x177245385090 which is the square root of pi as a binary coded decimal (BCD), a four byte checksum, and empty bits for byte alignment).
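
The trailer layout described in that quote can be checked from the end of a stream. The following is a minimal sketch, assuming a file that holds exactly one stream (which is what Python's `bz2.compress` produces) and assuming bits are packed most significant bit first. The helper name `trailer_padding` is hypothetical. It tries each possible padding length from 0 to 7 bits and looks for the 48-bit end-of-stream marker sitting just before the 32-bit checksum.

```python
import bz2

EOS_MAGIC = 0x177245385090  # end-of-stream marker from the quote above
MASK48 = (1 << 48) - 1


def trailer_padding(stream: bytes):
    """Return the number of padding bits (0-7) after the trailer, or None.

    Reads the stream as one big-endian integer, so the last bit of the file
    is the lowest bit. Layout assumed from the end: padding, 32-bit checksum,
    48-bit end-of-stream marker.
    """
    n = int.from_bytes(stream, "big")
    for pad in range(8):
        if (n >> (pad + 32)) & MASK48 == EOS_MAGIC:
            return pad
    return None


for payload in (b"", b"hello world", b"a" * 1000):
    stream = bz2.compress(payload)
    print(len(payload), "input bytes ->", trailer_padding(stream), "padding bits")
```

A result other than 0 means the end-of-stream marker itself does not start on a byte boundary, so a byte-oriented search would not see it.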



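The bzip2 Wikipedia article also mentions that blocks are bit-aligned (see its File format section), which seems consistent with what I remember from school (I had to implement the algorithm...).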
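
To see the effect on the question's grep idea, one can compare a bit-level search with a byte-level one. This is a sketch under stated assumptions, not a block-recovery tool: it assumes bits are packed most significant bit first, and the payload size, random seed, and compression level 1 (100k blocks) are illustrative choices made so that the stream contains several blocks. The helper name `magic_bit_offsets` is hypothetical.

```python
import bz2
import random

BLOCK_MAGIC = b"1AY&SY"  # 0x314159265359


def magic_bit_offsets(data: bytes, magic: bytes = BLOCK_MAGIC) -> list:
    """Return every bit offset (MSB first) at which `magic` starts in `data`."""
    bits = "".join(format(byte, "08b") for byte in data)
    pattern = "".join(format(byte, "08b") for byte in magic)
    offsets = []
    pos = bits.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = bits.find(pattern, pos + 1)
    return offsets


# Incompressible input, so each 100k block stays large and several are needed.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(250_000))
compressed = bz2.compress(payload, compresslevel=1)

offsets = magic_bit_offsets(compressed)
aligned = [off for off in offsets if off % 8 == 0]
print("bit-level matches:   ", len(offsets))
print("byte-aligned matches:", len(aligned))
print("bytes.count result:  ", compressed.count(BLOCK_MAGIC))
```

The first block follows the 4-byte stream header directly, so it starts at bit offset 32 and is byte-aligned. Later blocks start wherever the previous one ended, so a byte-oriented search can report fewer matches than there are blocks, and in principle a chance match inside compressed data could add to the count.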

Regarding "block - Bzip2 block header: 1AY&SY", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18262703/
