gpt4 book ai didi

java - 这个基准比较 Java 8u40 和 8u31 乘以 BigInteger 值是否正确?

转载 作者:搜寻专家 更新时间:2023-11-01 03:03:50 25 4
gpt4 key购买 nike

我看到了JDK-8055494 : Add C2 x86 intrinsic for BigInteger::multiplyToLen() method修复了 8u40 更新,但我没想到它将 BigInteger 乘法加速 2 到 3 倍。

以下是在两个 JVM 更新上通过不同公式计算阶乘的基准测试结果:

8u31

[info] Benchmark                             (n)   Mode  Cnt       Score       Error   Units
[info] JavaFactorial.recursion 1000 thrpt 5 13.994 ± 0.175 ops/ms
[info] JavaFactorial.recursion 10000 thrpt 5 0.202 ± 0.054 ops/ms
[info] JavaFactorial.recursionPar 1000 thrpt 5 12.066 ± 8.011 ops/ms
[info] JavaFactorial.recursionPar 10000 thrpt 5 0.253 ± 0.055 ops/ms
[info] JavaFactorial.split 1000 thrpt 5 18.255 ± 2.656 ops/ms
[info] JavaFactorial.split 10000 thrpt 5 0.286 ± 0.063 ops/ms

8u40

[info] Benchmark                             (n)   Mode  Cnt       Score       Error   Units
[info] JavaFactorial.recursion 1000 thrpt 5 33.704 ± 0.445 ops/ms
[info] JavaFactorial.recursion 10000 thrpt 5 0.428 ± 0.199 ops/ms
[info] JavaFactorial.recursionPar 1000 thrpt 5 38.170 ± 0.433 ops/ms
[info] JavaFactorial.recursionPar 10000 thrpt 5 0.557 ± 0.030 ops/ms
[info] JavaFactorial.split 1000 thrpt 5 46.447 ± 11.582 ops/ms
[info] JavaFactorial.split 10000 thrpt 5 0.586 ± 0.154 ops/ms

我是否应该期望这个 JDK 改进导致运行 my code更快还是这是另一个随意的基准?

编辑

如何验证我的发现以证明这种加速是由于 C2 固有的?

测试函数代码:

@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(1)
public class JavaFactorial {
@Param({"10", "100", "1000", "10000"})
public int n;

private static ForkJoinPool pool = new ForkJoinPool();

@Benchmark
public BigInteger loop() {
return n > 20 ? loop(1, n) : BigInteger.valueOf(fastLoop(1, n));
}

@Benchmark
public BigInteger recursion() {
return n > 20 ? recursion(1, n) : BigInteger.valueOf(fastLoop(1, n));
}

@Benchmark
public BigInteger recursionPar() {
return n > 20 ? recursePar(1, n) : BigInteger.valueOf(fastLoop(1, n));
}

@Benchmark
public BigInteger split() {
return n > 180 ? split(n) : (n > 20 ? recursion(1, n) : BigInteger.valueOf(fastLoop(1, n)));
}

private long fastLoop(final int n1, int n2) {
long p = n1;
while (n2 > n1) {
p = p * n2;
n2--;
}
return p;
}

private BigInteger loop(int n1, final int n2) {
final long l = Long.MAX_VALUE >> (32 - Integer.numberOfLeadingZeros(n2));
long p = 1;
BigInteger r = BigInteger.ONE;
while (n1 <= n2) {
if (p <= l) {
p *= n1;
} else {
r = r.multiply(BigInteger.valueOf(p));
p = n1;
}
n1++;
}
return r.multiply(BigInteger.valueOf(p));
}

private BigInteger recursion(final int n1, final int n2) {
if (n2 - n1 < 65) {
return loop(n1, n2);
}
final int nm = (n1 + n2) >> 1;
return recursion(nm + 1, n2).multiply(recursion(n1, nm));
}

private BigInteger recursePar(final int n1, final int n2) {
if (n2 - n1 < 700) {
return recursion(n1, n2);
}
final int nm = (n1 + n2) >> 1;
RecursiveTask<BigInteger> t = new RecursiveTask<BigInteger>() {
protected BigInteger compute() {
return recursePar(nm + 1, n2);
}
};
if (ForkJoinTask.getPool() == pool) {
t.fork();
} else {
pool.execute(t);
}
return recursePar(n1, nm).multiply(t.join());
}

private BigInteger loop2(int n1, final int n2) {
final long l = Long.MAX_VALUE >> (32 - Integer.numberOfLeadingZeros(n2));
long p = 1;
BigInteger r = BigInteger.ONE;
while (n1 <= n2) {
if (p <= l) {
p *= n1;
} else {
r = r.multiply(BigInteger.valueOf(p));
p = n1;
}
n1 += 2;
}
return r.multiply(BigInteger.valueOf(p));
}

private BigInteger recursion2(final int n1, final int n2) {
if (n2 - n1 < 65) {
return loop2(n1, n2);
}
final int nm = ((n1 + n2) >> 1) | 1;
return recursion2(nm, n2).multiply(recursion2(n1, nm - 2));
}

private BigInteger split(int n) {
int i = 31 - Integer.numberOfLeadingZeros(n), s = -n, o = 1;
BigInteger p = BigInteger.ONE, r = BigInteger.ONE;
while (i >= 0) {
int h = n >> i;
int o1 = (h - 1) | 1;
if (o < o1) {
p = p.multiply(recursion2(o + 2, o1));
r = r.multiply(p);
}
o = o1;
s += h;
i--;
}
return r.shiftLeft(s);
}
}

最佳答案

在调查可用的 JVM 选项列表后,我发现添加到 8u40 的一个选项可以准确地打开/关闭所需的内在。

下面是 8u40 的 -XX:-UseMultiplyToLenIntrinsic 结果与 8u31 的结果非常接近:

    [info] JavaFactorial.recursion              1000  thrpt    5      12.521 ±      3.352  ops/ms
[info] JavaFactorial.recursion 10000 thrpt 5 0.217 ± 0.069 ops/ms
[info] JavaFactorial.recursionPar 1000 thrpt 5 14.268 ± 8.319 ops/ms
[info] JavaFactorial.recursionPar 10000 thrpt 5 0.286 ± 0.015 ops/ms
[info] JavaFactorial.split 1000 thrpt 5 18.768 ± 4.321 ops/ms
[info] JavaFactorial.split 10000 thrpt 5 0.255 ± 0.076 ops/ms

关于java - 这个基准比较 Java 8u40 和 8u31 乘以 BigInteger 值是否正确?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29538258/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com