gpt4 book ai didi

Java 基准测试 : Ensuring that objects are not reused after coming out of scope

转载 作者:塔克拉玛干 更新时间:2023-11-03 04:45:48 25 4
gpt4 key购买 nike

我正在对几种算法进行基准测试,这些算法处理第 k 个最近邻问题的变体。在重复运行对我的数据进行排序的算法时,我看到了令人不安的结果。

更新

似乎数据本身并没有像人们所说的那样被缓存。我运行了相同的 n = 50000 测试,但每次运行三个测试时都会在两个 ArrayList 中生成随机点。我担心的加速仍然发生了。这似乎是 JITC 的功能。

输出:

Stopping :: Brute Force. Time (ms): 21716
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 24014
Stopping :: Quickselect. Time (ms): 17655

Stopping :: Brute Force. Time (ms): 22034
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 23975
Stopping :: Quickselect. Time (ms): 18438

Stopping :: Brute Force. Time (ms): 20097
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15677
Stopping :: Quickselect. Time (ms): 14399

Stopping :: Brute Force. Time (ms): 20457
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15141
Stopping :: Quickselect. Time (ms): 14146

Stopping :: Brute Force. Time (ms): 20143
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15834
Stopping :: Quickselect. Time (ms): 14084

Stopping :: Brute Force. Time (ms): 20173
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15170
Stopping :: Quickselect. Time (ms): 13745

Stopping :: Brute Force. Time (ms): 19625
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 14924
Stopping :: Quickselect. Time (ms): 15972

Stopping :: Brute Force. Time (ms): 19388
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15209
Stopping :: Quickselect. Time (ms): 13639

Stopping :: Brute Force. Time (ms): 19420
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 14779
Stopping :: Quickselect. Time (ms): 13798

Stopping :: Brute Force. Time (ms): 19390
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15078
Stopping :: Quickselect. Time (ms): 13548

原始查询

背景/细节:

我有两个未排序的 n 大小的点数组列表 - A 和 B。对于 A 中的每个点 P,检查 B 中有多少点在距 P 的距离 d 内。

算法:

  • 蛮力
    • 对于 A 中的每个点 P1,检查 B 中的每个点 P2。
  • 经过排序的智能暴力破解
    • 创建一个临时的ArrayList C,并将B深拷贝到其中。
    • 按点的 x 值对 C 排序。
    • 扫描B直到到达第一个超过(P1.x + d)的P2.x
  • 快速选择
    • 创建一个临时的ArrayList C,并将B深拷贝到其中。
    • 按点的 x 值对 C 排序。
    • 对于A中的每个点P1,通过C进行二分查找,直到找到一个点P2,其x值在P1.x±d范围内。
    • 从找到的点向左和向右扫描。当到达位于 d 范围之外的点时停止扫描。

示例:

public void quickSelect(ArrayList<Point> field1, ArrayList<Point> input2, int distance){
ArrayList<Point> field2 = new ArrayList<Point>();

deepCopy(field2, input2);

startTimer("Starting :: Quickselect.\n");
Collections.sort(field2, new ComparePoints());

for(int i = 0; i < field1.size(); ++i){
//Find pivot
...
//Scan to its left until out of bounds.
...
//Scan to its right until out of bounds.
...
}

endTimer("Stopping :: Quickselect. ");

field2.clear();
}

deepCopy 在哪里:

private void deepCopy(ArrayList<Point> field1, ArrayList<Point> input1){
for(int i = 0; i < input1.size(); ++i){
field1.add(input1.get(i).getLocation());
}
}

测试方法:

public static void main(String[] args){
int points = 50000;
ArrayList<Point> field1 = generator.makeGraph(points);
ArrayList<Point> field2 = generator.makeGraph(points);
GraphTester tester = new GraphTester(field1);
int maxDistance = 300;

for(int i = 0; i < 10; ++i){
tester.bruteForce(field1, field2, maxDistance);
}

for(int i = 0; i < 10; ++i){
tester.sortedBruteForce(field1, field2, maxDistance);
}

for(int i = 0; i < 10; ++i){
tester.quickSelect(field1, field2, maxDistance);
}
}

结果:

Stopping :: Brute Force. Time (ms): 22851
Stopping :: Brute Force. Time (ms): 22482
Stopping :: Brute Force. Time (ms): 20690
Stopping :: Brute Force. Time (ms): 21073
Stopping :: Brute Force. Time (ms): 20860
Stopping :: Brute Force. Time (ms): 21311
Stopping :: Brute Force. Time (ms): 20847
Stopping :: Brute Force. Time (ms): 21000
Stopping :: Brute Force. Time (ms): 20503
Stopping :: Brute Force. Time (ms): 21342

Stopping :: Sorted 'Smart' Brute Force. Time (ms): 23083
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 22616
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15881
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15323
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15930
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15360
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 16072
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15601
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 16952
Stopping :: Sorted 'Smart' Brute Force. Time (ms): 15950

Stopping :: Quickselect. Time (ms): 18202
Stopping :: Quickselect. Time (ms): 18109
Stopping :: Quickselect. Time (ms): 14685
Stopping :: Quickselect. Time (ms): 14401
Stopping :: Quickselect. Time (ms): 14052
Stopping :: Quickselect. Time (ms): 14782
Stopping :: Quickselect. Time (ms): 14175
Stopping :: Quickselect. Time (ms): 14187
Stopping :: Quickselect. Time (ms): 13870
Stopping :: Quickselect. Time (ms): 14601

如您所见,Sorted Smart Brute Force 和 Quickselect 算法开始时的计算时间较长,但随着测试的重复,逐渐趋于稳定到更好的数字。

虽然我正在清除临时 ArrayList,但我觉得排序后的 ArrayList 以某种方式被 JVM 保留并重新使用/重新映射到新的 ArrayList 指针。

我的问题有两个方面,那么:

  • 这看起来像是缓存的情况吗?
  • 我能保证数据不被缓存吗?
    • 如果我每次生成图表时都随机化这些点,我敢打赌会有更多控制,但我无法确定算法是否完全匹配。

最佳答案

我不明白你为什么要深度克隆数据。

Does this seem like a case of caching?

不,是 JITC 在您运行的内容上做得更好。每个重要的 Java 基准测试都需要预热,在此期间收集统计数据,然后对代码的相关部分(HotSpot)进行大量编译和优化。

更复杂的算法需要更多的预热。

Can I guarantee that the data is not cached?

我在那里看不到任何缓存。谁会做? JITC 无法理解你在做什么。如果它知道您一直在计算相同的东西,那么除一次之外的所有时间都将为零。

If I randomize the Points every time I generate the graphs, I'm betting there would be more control, but then I couldn't be sure that the algorithms are fairly matched.

如果你有这种感觉,那就去吧。我想您可以宁愿相信您的基准测试也无需这样做。

评论后更新

预热:常见的情况是服务器运行许多小时(至少),一直在执行大部分相同的操作。对于这样的服务器,在最初几分钟内测量性能毫无意义。另一种需要预热的情况是测量或优化非常快速的操作,例如String.hashCode,通常重复数千次。单次长时间计算的情况似乎是个异常(exception)。

不过,我写的关于热身的内容是对您的观察的合理解释。总之,Quickselect 在所有运行中都是赢家,所以选择很明确。它后来有所改善的事实对获胜者来说是不错的奖励。接下来的位置不太清楚,但 22851 与 23083 并不是真正的胜利,因为测量误差更大(因此可能是深度复制的成本)。如果您想了解更多(1 次迭代就足够了),请多次重新运行基准测试(使用新的 JVM)。

关于Java 基准测试 : Ensuring that objects are not reused after coming out of scope,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24317685/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com