Java Hashcode 导致整数溢出

转载作者：行者123 更新时间：2023-12-02 12:25:42

28

4

背景信息:

在我的项目中，我将强化学习 (RL) 应用到 Mario 领域。对于我的状态表示，我选择使用以自定义对象作为键的哈希表。我的自定义对象是不可变的，并且已覆盖 .equals() 和 .hashcode()(由 IntelliJ IDE 生成)。

这是生成的 .hashcode()，我在注释中添加了可能的值作为额外信息:

@Override
public int hashCode() {
    int result = (stuck ? 1 : 0);                // 2 possible values: 0, 1
    result = 31 * result + (facing ? 1 : 0);     // 2 possible values: 0, 1 
    result = 31 * result + marioMode;            // 3 possible values: 0, 1, 2
    result = 31 * result + (onGround ? 1 : 0);   // 2 possible values: 0, 1 
    result = 31 * result + (canJump ? 1 : 0);    // 2 possible values: 0, 1 
    result = 31 * result + (wallNear ? 1 : 0);   // 2 possible values: 0, 1 
    result = 31 * result + nearestEnemyX;        // 33 possible values: - 16 to 16
    result = 31 * result + nearestEnemyY;        // 33 possible values: - 16 to 16

    return result;
}

问题:

这里的问题是上面代码中的结果可能超过Integer.MAX_VALUE。我在网上读到这不一定是问题，但就我而言，它是问题。这部分是由于所使用的算法是 Q-Learning(一种 RL 方法)，并且取决于哈希表中存储的正确 Q 值。基本上我在检索值时不会发生冲突。当运行我的实验时，我发现结果根本不好，并且我 95% 确定问题在于从哈希表中检索 Q 值。 (如果需要，我可以详细说明为什么我对此确信无疑，但这需要一些与该问题无关的项目的额外信息。)

问题:

有没有办法避免整数溢出，也许我在这里忽略了一些东西？或者是否有另一种方法(可能是另一种数据结构)来相当快地获得给定我的自定义键的值？

备注:

在阅读了一些评论后，我确实意识到我选择使用哈希表可能不是最好的选择，因为我想要不会导致冲突的唯一键。如果我仍然想使用哈希表，我可能需要正确的编码。

最佳答案

您需要一个专用的关键字段来保证唯一性

.hashCode() 并不是为您的用途而设计的

.hashCode() 旨在在分桶算法中提供良好的一般结果，该算法可以容忍轻微的冲突。它并不是为了提供唯一的 key 而设计的。默认算法是时间和空间以及轻微碰撞的权衡，它不应该保证唯一性。

完美哈希

您需要实现的是 perfect hash或基于对象内容的其他一些唯一键。这在 int 的边界内是可能的，但我不会使用 .hashCode() 来表示。我会在对象上使用显式键字段。

独特的哈希

一种使用标准库中内置的 SHA1 哈希的方法，对于小数据集来说，发生冲突的可能性极低。您发布到 SHA1 的值不会出现巨大的组合爆炸。

您应该能够计算出一种方法，利用您在问题中显示的有限值来生成最小完美哈希。

A minimal perfect hash function is a perfect hash function that maps n keys to n consecutive integers—usually [0..n−1] or [1..n]. A more formal way of expressing this is: Let j and k be elements of some finite set K. F is a minimal perfect hash function iff F(j) =F(k) implies j=k (injectivity) and there exists an integer a such that the range of F is a..a+|K|−1. It has been proved that a general purpose minimal perfect hash scheme requires at least 1.44 bits/key.2 The best currently known minimal perfect hashing schemes use around 2.6 bits/key.[3]

A minimal perfect hash function F is order preserving if keys are given in some order a1, a2, ..., an and for any keys aj and ak, j

A minimal perfect hash function F is monotone if it preserves the lexicographical order of the keys. In this case, the function value is just the position of each key in the sorted ordering of all of the keys. If the keys to be hashed are themselves stored in a sorted array, it is possible to store a small number of additional bits per key in a data structure that can be used to compute hash values quickly.[6]

解决方案

请注意，它在谈论 URL 时，它可以是您从对象计算出的任何 String 的任何 byte[] 表示形式。

我通常会重写 toString() 方法以使其生成唯一的内容，然后将其输入到 UUID.nameUUIDFromBytes() 方法中。

Type 3 UUID can be just as useful as well UUID.nameUUIDFromBytes()

Version 3 UUIDs use a scheme deriving a UUID via MD5 from a URL, a fully qualified domain name, an object identifier, a distinguished name (DN as used in Lightweight Directory Access Protocol), or on names in unspecified namespaces. Version 3 UUIDs have the form xxxxxxxx-xxxx-3xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, A, or B.

To determine the version 3 UUID of a given name, the UUID of the namespace (e.g., 6ba7b810-9dad-11d1-80b4-00c04fd430c8 for a domain) is transformed to a string of bytes corresponding to its hexadecimal digits, concatenated with the input name, hashed with MD5 yielding 128 bits. Six bits are replaced by fixed values, four of these bits indicate the version, 0011 for version 3. Finally, the fixed hash is transformed back into the hexadecimal form with hyphens separating the parts relevant in other UUID versions.

我的首选解决方案是 Type 5 UUID(Type 3 的 SHA 版本)

Version 5 UUIDs use a scheme with SHA-1 hashing; otherwise it is the same idea as in version 3. RFC 4122 states that version 5 is preferred over version 3 name based UUIDs, as MD5's security has been compromised. Note that the 160 bit SHA-1 hash is truncated to 128 bits to make the length work out. An erratum addresses the example in appendix B of RFC 4122.

关键对象应该是不可变的

这样你就可以计算toString()、.hashCode()并在构造函数中生成唯一的主键并设置它们一次而不是一遍又一遍地计算它们。

这是一个惯用的不可变对象(immutable对象)的稻草人示例，并根据对象的内容计算唯一键。

package com.stackoverflow;

import javax.annotation.Nonnull;
import java.util.Date;
import java.util.UUID;

public class Q23633894
{

    public static class Person
    {
        private final String firstName;
        private final String lastName;
        private final Date birthday;
        private final UUID key;
        private final String strRep;

        public Person(@Nonnull final String firstName, @Nonnull final String lastName, @Nonnull final Date birthday)
        {
            this.firstName = firstName;
            this.lastName = lastName;
            this.birthday = birthday;
            this.strRep = String.format("%s%s%d", firstName, lastName, birthday.getTime());
            this.key = UUID.nameUUIDFromBytes(this.strRep.getBytes());
        }

        @Nonnull
        public UUID getKey()
        {
            return this.key;
        }

        // Other getter/setters omitted for brevity

        @Override
        @Nonnull
        public String toString()
        {
            return this.strRep;
        }

        @Override
        public boolean equals(final Object o)
        {
            if (this == o) { return true; }
            if (o == null || getClass() != o.getClass()) { return false; }
            final Person person = (Person) o;
            return key.equals(person.key);
        }

        @Override
        public int hashCode()
        {
            return key.hashCode();
        }
    }
}

关于Java Hashcode 导致整数溢出，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/23633894/

28

4

0

文章推荐： java - 我应该把 Java 程序的资源放在哪里？

文章推荐： java - 关于java中抽象类的查询

文章推荐： java - XOR (^) 交换算法如何工作？

java - 如何找到 Cassandra 导致 Spark 作业中止的根本原因(导致 ClassCastException - ShuffleMapTask 到 Task)？
我正在尝试使用 Spark 从 Cassandra 读取数据。 DataFrame rdf = sqlContext.read().option("keyspace", "readypulse
ctime() 导致 SIGABRT(？!)
这是代码: void i_log_ (int error, const char * file, int line, const char * fmt, ...) { /* Get erro
导致 Gtk 在断言时中止
我必须调试一个严重依赖 Gtk 的程序。问题是由于某些原因，在使用 GtkWindow 对象时开始出现许多运行时警告。问题是，即使 Gtk 提示严重错误，它也不会因这些错误而中止。我没有代码库的更改历
glsl - glGetProgramBinary 导致 GL_INVALID_OPERATION
我正在尝试从已有效编译和链接的程序中检索二进制文件。我已经通过 GL_PROGRAM_BINARY_LENGTH 收到了它的长度。该文档说有两个实例可能会发生 GL_INVALID_OPERATION
wcf - 导致 ServiceActivationException 的原因是什么？
我有一个托管在 Azure 环境中的服务。我正在使用控制台应用程序使用该服务。这样做时，我得到了异常: "The requested service, 'http://xxxx-d.yyyy.be/S
multithreading - sem_init() 导致 SEGV
我有以下代码，它被 SEGV 信号杀死。使用调试器表明它被 main() 中的第一个 sem_init() 杀死。如果我注释掉第一个 sem_init() ，第二个会导致同样的问题。我试图弄清楚是什么
xcode - NSJSONSerialization 导致 EXC_BAD_ACCESS
目前我正在编写一个应用程序(目标 iOS 6，启用 ARC)，它使用 JSON 进行数据传输，使用核心数据进行持久存储。 JSON 数据由 PHP 脚本通过 json_encode 从 MySQL 数
android - PopAsync 导致 ArgumentOutOfRangeException
我对 Xamarin.Forms 还是很陌生。我在出现的主页上有一个非常简单的功能 async public Task BaseAppearing() { if (UserID
android - notifyDataSetChanged() 导致 IndexOutOfBoundsException
这是我的代码的简化版本。 public class MainActivity extends ActionBarActivity { private ArrayList entry = new Arr
java - 导致 NoSuchMethodError 的显式转换？
我想弄明白为什么我的两个 Java 库很难很好地协同工作。这是场景: 库 1 有一个类 A，其构造函数如下: public A(Object obj) { /* boilerplate */ } 在以
iphone - didReceiveAuthenticationChallenge 导致 EXC_BAD_ACCESS
如果网站不需要身份验证，我的代码可以正常工作，如果需要，则在打印“已创建凭据”后会立即出现 EXC_BAD_ACCESS 错误。我不会发布任何内容，并且此代码是直接从文档中复制的 - 知道出了什么问题
iphone - NSArray 导致 EXC_BAD_ACCESS
我在使用 NSArray 填充 UITableView 时遇到问题。我确信我正在做一些愚蠢的事情，但我无法弄清楚。当我尝试进行简单的计数时，我得到了 EXC_BAD_ACCESS，我知道这是因为我试图
iphone - resignFirstResponder 导致 EXC_BAD_ACCESS
我在 UITableViewCell 上有一个 UITextField，在另一个单元格上有一个按钮。我单击 UITextField(出现键盘)。 UITextField 调用了以下方法: - (BO
iphone - MKReverseGeocoder 导致 EXC_BAD_ACCESS？
我有一个应用程序出现间歇性崩溃。崩溃日志显示了一个堆栈跟踪，这对我来说很难破译，因此希望其他人看到了这一点并能为我指出正确的方向。基本上，应用程序在启动时执行反向地理编码请求，以在标签中显示用户的位
iphone - UIImageWriteToSavedPhotosAlbum 导致 EXC_BAD_ACCESS
我开发了一个 CGImage，当程序使用以下命令将其显示在屏幕上时它工作正常: [output_view.layer performSelectorOnMainThread:@selector(set
android - EncryptedSharedPreferences 导致 UnrecoverableKeyException
我正在使用新的 EncryptedSharedPreferences以谷歌推荐的方式上课: private fun securePrefs(context: Context): SharedPrefe
javascript - ClientId 导致 NullReferenceException
我有一个中继器，里面有一些控件，其中一个是文本框。我正在尝试使用 jquery 获取文本框，我的代码如下所示: $("#").click(function (event) {}); 但我总是得到 nu
android - 导致 TTS 初始化失败的原因是什么？
在以下场景中观察到 TTS 初始化错误，太随机了。已安装 TTS 引擎，存在语音集，并且可以从辅助功能选项中播放示例 tts。 TTS 初始化在之前初始化和播放的同一设备上随机失败。在不同的设备(
java - 64位VM不启动指针压缩，导致-8内存对齐
maven pom.xml org.openjdk.jol jol-core 0.10 Java 类: public class MyObjectData { pr
math - 导致 MD5 冲突的最短字符串对是什么？
在不担心冲突的情况下，可以使用 MD5 作为哈希值，字符串长度最多为多少？这可能是通过为特定字符集中的每个可能的字符串生成 MD5 哈希来计算的，长度不断增加，直到哈希第二次出现(冲突)。没有冲突的

首页

博学

6Ren·AI

商城