
java - Porting Kafka's murmur2 implementation to Go

Reposted. Author: IT王子. Updated: 2023-10-29 01:28:01

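Kafka's JVM client uses a custom implementation of the murmur2 hash for its default partitioner.

None of the Kafka clients for Go implement this hashing algorithm, which causes problems when you need consistent partitioning across different clients on different platforms.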
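
To make the consistency requirement concrete: Kafka's Java default partitioner clears the sign bit of the murmur2 hash of the key and takes the result modulo the number of partitions. The sketch below mirrors that step in Go. The names toPositive and partitionFor are hypothetical helpers chosen for illustration, not part of any Go Kafka client, and the hash is passed in rather than computed here.

```go
package main

// toPositive clears the sign bit so the value is never negative.
// This mirrors the masking step (number & 0x7fffffff) done on the Java side.
func toPositive(n int32) int32 {
	return n & 0x7fffffff
}

// partitionFor is a hypothetical helper: it maps an already computed
// murmur2 hash of a record key to a partition index in [0, numPartitions).
// It assumes numPartitions > 0.
func partitionFor(hash, numPartitions int32) int32 {
	return toPositive(hash) % numPartitions
}
```

Because the partition index depends directly on the hash bits, a Go client only lands on the same partition as a JVM client if its murmur2 returns exactly the same 32-bit value for the same key bytes.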

I am trying to port this code to Go. It seems to work for some values but not for others.

Here is the Java code (source: https://github.com/apache/kafka/blob/1.0.0/clients/src/main/java/org/apache/kafka/common/utils/Utils.java#L353):

public static int murmur2(final byte[] data) {
    int length = data.length;
    int seed = 0x9747b28c;
    // 'm' and 'r' are mixing constants generated offline.
    // They're not really 'magic', they just happen to work well.
    final int m = 0x5bd1e995;
    final int r = 24;

    // Initialize the hash to a random value
    int h = seed ^ length;
    int length4 = length / 4;

    for (int i = 0; i < length4; i++) {
        final int i4 = i * 4;
        int k = (data[i4 + 0] & 0xff) + ((data[i4 + 1] & 0xff) << 8) + ((data[i4 + 2] & 0xff) << 16) + ((data[i4 + 3] & 0xff) << 24);
        k *= m;
        k ^= k >>> r;
        k *= m;
        h *= m;
        h ^= k;
    }

    // Handle the last few bytes of the input array
    switch (length % 4) {
        case 3:
            h ^= (data[(length & ~3) + 2] & 0xff) << 16;
        case 2:
            h ^= (data[(length & ~3) + 1] & 0xff) << 8;
        case 1:
            h ^= data[length & ~3] & 0xff;
            h *= m;
    }

    h ^= h >>> 13;
    h *= m;
    h ^= h >>> 15;

    return h;
}

Here is the Go code (Playground link: https://play.golang.org/p/K4VooLZ4Mp7):

package main

import "fmt"

func main() {
	cases := []struct {
		Input    []byte
		Expected int32
	}{
		{[]byte("21"), -973932308},
		{[]byte("foobar"), -790332482}, // outputs: 1518714010
		{[]byte("a-little-bit-long-string"), -985981536}, // outputs 2068422364
		{[]byte("a-little-bit-longer-string"), -1486304829}, // outputs 1797390322
		{[]byte("lkjh234lh9fiuh90y23oiuhsafujhadof229phr9h19h89h8"), -58897971}, // outputs -1332218133
		{[]byte{'a', 'b', 'c'}, 479470107},
	}

	for _, c := range cases {
		if res := murmur2(c.Input); res != c.Expected {
			fmt.Printf("input: %q, expected: %d, got: %d\n", c.Input, c.Expected, res)
		}
	}
}

func murmur2(data []byte) int32 {
	length := int32(len(data))
	seed := uint32(0x9747b28c)
	m := int32(0x5bd1e995)
	r := uint32(24)

	h := int32(seed ^ uint32(length))
	length4 := length / 4

	for i := int32(0); i < length4; i++ {
		i4 := i * 4
		k := int32(data[i4+0]&0xff) + (int32(data[i4+1]&0xff) << 8) + (int32(data[i4+2]&0xff) << 16) + (int32(data[i4+3]&0xff) << 24)
		k ^= int32(uint32(k) >> r)
		k *= m
		h *= m
		h ^= k
	}

	switch length % 4 {
	case 3:
		h ^= int32(data[(length & ^3)+2]&0xff) << 16
		fallthrough
	case 2:
		h ^= int32(data[(length & ^3)+1]&0xff) << 8
		fallthrough
	case 1:
		h ^= int32(data[length & ^3] & 0xff)
		h *= m
	}

	h ^= int32(uint32(h) >> 13)
	h *= m
	h ^= int32(uint32(h) >> 15)

	return h
}

I generated the expected values for the Go test from Java, using the Utils class mentioned above, like this:

System.out.println(Utils.murmur2("a-little-bit-long-string".getBytes("UTF-8")));

None of the existing murmur2 implementations for Go that I have seen produce the same results as the Java code above.

The question is: how can I port the Java code above to Go so that both produce the same results?

Best answer

As @IskanderSharipov pointed out:

Go version misses one multiplication statement: k *= m inside the loop
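
With that fix applied, the port might look like the sketch below. It restores the first k *= m inside the loop (the Java code multiplies both before and after k ^= k >>> r). As an assumption of this sketch, not part of the quoted answer, the arithmetic is done in uint32: it wraps the same way as Java's int, and >> on an unsigned value behaves like Java's >>>, so the result is converted to int32 only on return.

```go
package main

// murmur2 is a sketch of a Go port of the Java murmur2 shown in the question.
// It is meant to return the same signed 32-bit value for the same bytes.
func murmur2(data []byte) int32 {
	const (
		seed uint32 = 0x9747b28c
		m    uint32 = 0x5bd1e995
		r           = 24
	)

	length := len(data)
	h := seed ^ uint32(length)

	// Mix the input four bytes at a time, read as a little-endian word.
	for i := 0; i+4 <= length; i += 4 {
		k := uint32(data[i]) | uint32(data[i+1])<<8 |
			uint32(data[i+2])<<16 | uint32(data[i+3])<<24
		k *= m // this multiplication was missing in the question's loop
		k ^= k >> r
		k *= m
		h *= m
		h ^= k
	}

	// Handle the remaining 1 to 3 bytes; the cases fall through on purpose,
	// matching the Java switch without break statements.
	tail := length &^ 3
	switch length % 4 {
	case 3:
		h ^= uint32(data[tail+2]) << 16
		fallthrough
	case 2:
		h ^= uint32(data[tail+1]) << 8
		fallthrough
	case 1:
		h ^= uint32(data[tail])
		h *= m
	}

	// Final avalanche.
	h ^= h >> 13
	h *= m
	h ^= h >> 15

	return int32(h)
}
```

The missing multiplication also explains why only some inputs failed: "21" and "abc" are shorter than four bytes, so the loop never runs for them and the bug is not exercised. With the fix in place, the function should reproduce the Java-generated values from the question's test table, for example -790332482 for "foobar".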

The original question, "java - Porting Kafka's murmur2 implementation to Go", is on Stack Overflow: https://stackoverflow.com/questions/48582589/
