gpt4 book ai didi

hadoop - VIntWritable 与 IntWritable

转载 作者:可可西里 更新时间:2023-11-01 16:26:19 26 4
gpt4 key购买 nike

我明白 VIntWritableIntWritable 相比,可以显着减少存储整数所需的大小.

我的问题是:使用 VIntWritable 而不是 IntWritable 的成本是多少?是(仅)压缩所需的时间吗?换句话说,什么时候应该使用 IntWritable 而不是 VIntWritable?

最佳答案

How do you choose between a fixed-length and a variable-length encoding?

Fixedlength encodings are good when the distribution of values is fairly uniform across the whole value space, such as a (well-designed) hash function. Most numeric variables tend to have nonuniform distributions, and on average the variable-length encoding will save space. Another advantage of variable-length encodings is that you can switch from VIntWritable to VLongWritable, because their encodings are actually the same. So by choosing a variable-length representation, you have room to grow without committing to an 8-byte long representation from the beginning.

我刚从 definitive guide book 上捡到这个第98页

关于hadoop - VIntWritable 与 IntWritable,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/23672119/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com