gpt4 book ai didi

java - 在 HASHMAP 中计算两次键的哈希码

转载 作者:行者123 更新时间:2023-12-01 16:57:28 28 4
gpt4 key购买 nike

我一直在研究 Hashmap 的内部结构实现。

为了根据键从映射中添加或获取值,它将计算哈希码,然后找到存储桶位置(或表位置/索引,如果我错了,请纠正我)。
但它计算了两次哈希码。

在下面的代码片段中,key.hashcode()是对象类中的本地方法,然后在同一个类中实现hash方法。
哈希方法的注释中给出了为什么要计算两次,我无法理解。

谁能用一个场景简单解释一下吗?

int hash = hash(key.hashCode());

/ * Applies a supplemental hash function to a given hashCode, which
* defends against poor quality hash functions. This is critical
* because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}

谢谢。

最佳答案

http://tekmarathon.com/2012/12/04/hashmap-internal-implementation-analysis-in-java/

It means that if in case the algorithm we wrote for hashcode generation does not distribute/mix lower bits evenly, it will lead to more collisions. For example, we have hashcode logic of “empId*deptId” and if deptId is even, it would always generate even hashcodes because any number multiplied by EVEN is always EVEN. And if we directly depend on these hashcodes to compute the index and store our objects into hashmap then 1. odd places in the hashmap are always empty 2. because of #1, it would leave us to use only even places and hence double the number of collisions

它可以防御编写不当的哈希函数。此外,具有相似值的对象即使不一定相同也可能会导致冲突。冲突不好,它们会增加查找与键关联的值的时间,因为每个散列都指向一个值的链接列表,在检索时必须迭代该值以匹配正确的键。即使有一个好的哈希函数,您仍然需要“混合较低位”以确保二次幂分布均匀。

另请参阅:

免责声明:我在 HashMap 上大量工作了一年多,这是所有研究的来源

关于java - 在 HASHMAP 中计算两次键的哈希码,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30690738/

28 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com