Java parallel stream performance

Reposted. Author: 行者123. Updated: 2023-11-30 03:56:42

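While experimenting with the new Java streams, I noticed something odd about the performance of parallel streams. I used a simple program that reads the words from a text file and counts those longer than 5 characters (the test file has 30000 words):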

    String contents = new String(Files.readAllBytes(Paths.get("text.txt")));
    List<String> words = Arrays.asList(contents.split("[\\P{L}]+"));
    long startTime;
    for (int i = 0; i < 100; i++) {
        startTime = System.nanoTime();
        words.parallelStream().filter(w -> w.length() > 5).count();
        System.out.println("Time elapsed [PAR]: " + (System.nanoTime() - startTime));
        startTime = System.nanoTime();
        words.stream().filter(w -> w.length() > 5).count();
        System.out.println("Time elapsed [SEQ]: " + (System.nanoTime() - startTime));
        System.out.println("------------------");
    }

This produces the following output on my machine (only the first 5 and the last 5 loop iterations are shown):

Time elapsed [PAR]: 114185196
Time elapsed [SEQ]: 3222664
------------------
Time elapsed [PAR]: 569611
Time elapsed [SEQ]: 797113
------------------
Time elapsed [PAR]: 678231
Time elapsed [SEQ]: 414807
------------------
Time elapsed [PAR]: 755633
Time elapsed [SEQ]: 679085
------------------
Time elapsed [PAR]: 755633
Time elapsed [SEQ]: 393425
------------------
...
Time elapsed [PAR]: 90232
Time elapsed [SEQ]: 163785
------------------
Time elapsed [PAR]: 80396
Time elapsed [SEQ]: 154805
------------------
Time elapsed [PAR]: 83817
Time elapsed [SEQ]: 154377
------------------
Time elapsed [PAR]: 81679
Time elapsed [SEQ]: 186449
------------------
Time elapsed [PAR]: 68849
Time elapsed [SEQ]: 154804
------------------

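Why is the first run 100 times slower than the rest? Why is the parallel stream slower than the sequential stream in the first iteration, yet twice as fast in the last ones? Why do both the sequential and the parallel stream get faster over time? Is this related to loop optimization?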

Later edit: Following Luigi's suggestion, I implemented the benchmark using JUnitBenchmarks:

    List<String> words = null;

    @Before
    public void setup() {
        try {
            String contents = new String(Files.readAllBytes(Paths.get("text.txt")));
            words = Arrays.asList(contents.split("[\\P{L}]+"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @BenchmarkOptions(benchmarkRounds = 100)
    @Test
    public void parallelTest() {
        words.parallelStream().filter(w -> w.length() > 5).count();
    }

    @BenchmarkOptions(benchmarkRounds = 100)
    @Test
    public void sequentialTest() {
        words.stream().filter(w -> w.length() > 5).count();
    }

I also increased the number of words in the test file to 300000. The new results are:

Benchmark.sequentialTest: [measured 100 out of 105 rounds, threads: 1 (sequential)]

round: 0.08 [+- 0.04], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 62, GC.time: 1.53, time.total: 8.65, time.warmup: 0.81, time.bench: 7.85

Benchmark.parallelTest: [measured 100 out of 105 rounds, threads: 1 (sequential)]

round: 0.06 [+- 0.02], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 32, GC.time: 0.79, time.total: 6.82, time.warmup: 0.39, time.bench: 6.43

So it seems the initial results were caused by a flawed microbenchmark setup...
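
The output above reports "measured 100 out of 105 rounds", which suggests that a few initial rounds are run but left out of the statistics. A rough hand-rolled sketch of that idea, without JUnitBenchmarks, might look like the following. The class `WarmupBench`, its helpers, and the synthetic word list (standing in for `text.txt`) are all assumptions made for illustration, not part of the original benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

class WarmupBench {
    // Hypothetical stand-in for the words of text.txt: lengths cycle 1..10,
    // so exactly half of the words are longer than 5 characters.
    static List<String> syntheticWords(int n) {
        List<String> words = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            words.add("abcdefghij".substring(0, 1 + i % 10));
        }
        return words;
    }

    // The operation from the question, sequential or parallel.
    static long countLong(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .filter(w -> w.length() > 5)
                .count();
    }

    // Runs the task warmupRounds times untimed, then returns the elapsed
    // nanoseconds of each of the measuredRounds that follow.
    static long[] measure(LongSupplier task, int warmupRounds, int measuredRounds) {
        long sink = 0; // accumulate results so the work is not optimized away
        for (int i = 0; i < warmupRounds; i++) {
            sink += task.getAsLong();
        }
        long[] nanos = new long[measuredRounds];
        for (int i = 0; i < measuredRounds; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            nanos[i] = System.nanoTime() - start;
        }
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keeps sink live
        return nanos;
    }

    static double mean(long[] xs) {
        double sum = 0;
        for (long x : xs) sum += x;
        return sum / xs.length;
    }

    // Population standard deviation of the samples.
    static double stdDev(long[] xs) {
        double m = mean(xs);
        double sq = 0;
        for (long x : xs) sq += (x - m) * (x - m);
        return Math.sqrt(sq / xs.length);
    }

    public static void main(String[] args) {
        List<String> words = syntheticWords(300_000);
        long[] seq = measure(() -> countLong(words, false), 5, 100);
        long[] par = measure(() -> countLong(words, true), 5, 100);
        System.out.printf("SEQ: %.0f ns +- %.0f ns%n", mean(seq), stdDev(seq));
        System.out.printf("PAR: %.0f ns +- %.0f ns%n", mean(par), stdDev(par));
    }
}
```

The numbers this prints depend on the machine and JVM, and five warm-up rounds may well be too few for a steady state. A dedicated benchmark harness is likely to give more trustworthy figures than a loop like this.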

Best answer

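The HotSpot JVM starts executing a program in interpreted mode and, after some profiling, compiles the frequently used parts to native code. As a result, the initial iterations of a loop are usually slow.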
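
One way to observe this warm-up effect directly is to time every iteration separately and compare the first one with the later ones. The sketch below does that for the sequential count from the question. The class `JitWarmupDemo`, its helper methods, and the synthetic word list are hypothetical names and data introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JitWarmupDemo {
    // Hypothetical stand-in for the words read from text.txt.
    static List<String> syntheticWords(int n) {
        List<String> words = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            words.add("abcdefghij".substring(0, 1 + i % 10));
        }
        return words;
    }

    // Times each run of the sequential count on its own.
    static long[] timeIterations(List<String> words, int iterations) {
        long[] nanos = new long[iterations];
        long sink = 0; // accumulate results so the work is not optimized away
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            sink += words.stream().filter(w -> w.length() > 5).count();
            nanos[i] = System.nanoTime() - start;
        }
        if (sink < 0) throw new IllegalStateException("unexpected count");
        return nanos;
    }

    // Middle element of the sorted samples (upper median for even lengths).
    static long median(long[] xs) {
        long[] sorted = xs.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        long[] nanos = timeIterations(syntheticWords(30_000), 100);
        long[] tail = Arrays.copyOfRange(nanos, nanos.length - 10, nanos.length);
        System.out.println("first iteration:   " + nanos[0] + " ns");
        System.out.println("median of last 10: " + median(tail) + " ns");
    }
}
```

On a HotSpot JVM, running this with the `-XX:+PrintCompilation` flag additionally logs methods as they are compiled, which can help relate the drop in iteration time to compilation activity. The exact timings will differ between machines and JVM versions.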

This question about Java parallel stream performance comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/22999188/
