gpt4 book ai didi

java - 单线程与多线程 : Mergesort performance comparison

转载 作者:行者123 更新时间:2023-12-01 11:07:49 27 4
gpt4 key购买 nike

谁能看一下下面的代码片段吗?

它是一个类,通过将单词数组拆分为 10 个子数组来测试合并排序的多线程解决方案,而不是单线程解决方案并将整个数组作为参数传递。

出于某种原因,只有当数组大小为 100 万时,多线程版本才效果更好,而当我将其增加到 1000 万时,效果就很糟糕(比单线程解决方案多了 2 秒)。

可能是因为串联有序列表的最后合并步骤,第 120 行。

我猜想,到了某个时候,拆分数组、使用 10 个线程来处理数据集并对合并结果进行排序就不再具有性能优势了。

我知道这不是一个合适的基准测试,并且跳过 JVM 预热步骤可能会产生一点影响。

数组大小的平均次数 = 100 万次:

单线程:450ms

多线程:350ms

数组大小的平均次数 = 1000 万次:

单线程:3.5s

多线程:5s

package brandao;

import com.fnf.xes.services.test.util.generators.impl.StupidDictionary;
import org.apache.commons.io.FileUtils;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
* Created with IntelliJ IDEA.
* User: ostras
* Date: 23/09/15
* Time: 14:16
*
* Class to test multithreaded solution of mergesort by splitting array of words into 10 sub-arrays, vs single threaded solution and passing the whole array as argument.
* For some reason only works better for multiT when the size of the array is 1000000.
* Probably because of final merge step of the concatenated ordered lists, line 120.
* At some point there is no more performance advantage of splitting the array and using 10 threads to process the dataset.
*/
public class ThreadTest {

private static StupidDictionary stupidDictionary = new StupidDictionary();
private static ArrayList<String> list = new ArrayList<String>();
private static ArrayList<String> list2 = new ArrayList<String>();


public static void main(String[] args) {
ThreadTest test = new ThreadTest();
int length = 10000000;

stupidDictionary.loadDefault();

for (int i = 0; i <= length; i++) {
String word = stupidDictionary.getRandomWord();
list.add(word);
list2.add(word);
}

//System.out.println("The list is : " + list.toString());

long startTime = System.currentTimeMillis();

test.testSingleThread(list);

long stopTime = System.currentTimeMillis();
long elapsedTime = stopTime - startTime;
System.out.println("Single Thread : " + elapsedTime + "ms");
//System.out.println("Sorted list : " + list);
/*
try{
FileUtils.writeLines(new File("textdata.txt"), list);
} catch (IOException ioe) {

}
*/

startTime = System.currentTimeMillis();

test.testMultiThread(list2);

stopTime = System.currentTimeMillis();
elapsedTime = stopTime - startTime;
System.out.println("Multi Thread : " + elapsedTime + "ms");
//System.out.println("Sorted list 2 : " + test.testMultiThread(list2));
}

public void testSingleThread(ArrayList<String> array) {
mergeSort(array);
}

public ArrayList<String> testMultiThread(final ArrayList<String> obj) {
/* MyThread rum = new MyThread(obj.subList(0 , 100000));
MyThread rdois = new MyThread(obj.subList(100000 , 200000));
MyThread rtres = new MyThread(obj.subList(200000 , 300000));
MyThread rquatro = new MyThread(obj.subList(300000 , 400000));
MyThread rcinco = new MyThread(obj.subList(400000 , 500000));
MyThread rseis = new MyThread(obj.subList(500000 , 600000));
MyThread rsete = new MyThread(obj.subList(600000 , 700000));
MyThread roito = new MyThread(obj.subList(700000 , 800000));
MyThread rnove = new MyThread(obj.subList(800000 , 900000));
MyThread rdez = new MyThread(obj.subList(900000 , 1000001));*/

MyThread rum = new MyThread(obj.subList(0, 1000000));
MyThread rdois = new MyThread(obj.subList(1000000, 2000000));
MyThread rtres = new MyThread(obj.subList(2000000, 3000000));
MyThread rquatro = new MyThread(obj.subList(3000000, 4000000));
MyThread rcinco = new MyThread(obj.subList(4000000, 5000000));
MyThread rseis = new MyThread(obj.subList(5000000, 6000000));
MyThread rsete = new MyThread(obj.subList(6000000, 7000000));
MyThread roito = new MyThread(obj.subList(7000000, 8000000));
MyThread rnove = new MyThread(obj.subList(8000000, 9000000));
MyThread rdez = new MyThread(obj.subList(9000000, 10000001));

new Thread(rum).start();
new Thread(rdois).start();
new Thread(rtres).start();
new Thread(rquatro).start();
new Thread(rcinco).start();
new Thread(rseis).start();
new Thread(rsete).start();
new Thread(roito).start();
new Thread(rnove).start();
new Thread(rdez).start();

ArrayList<String> result = new ArrayList<String>();
result.addAll(rum.getToSort());
result.addAll(rdois.getToSort());
result.addAll(rtres.getToSort());
result.addAll(rquatro.getToSort());
result.addAll(rcinco.getToSort());
result.addAll(rseis.getToSort());
result.addAll(rsete.getToSort());
result.addAll(roito.getToSort());
result.addAll(rnove.getToSort());
result.addAll(rdez.getToSort());

mergeSort(result);

return result;
}

public void mergeSort(List<String> obj) {
Collections.sort(obj);
}

public class MyThread implements Runnable {
List<String> toSort;

public MyThread(List<String> toSort) {
this.toSort = toSort;
}

public void run() {
mergeSort(toSort);
}

public List<String> getToSort() {
return this.toSort;
}
}
}

最佳答案

合并排序应该受到内存带宽的限制,但是每个核心中的缓存可能会有所帮助。也许尝试使用 4 或 8 个线程而不是 10 个。排序完成后,合并 4 或 8 个部分,而不是再次连接和排序。

我不确定这会有多大帮助。使用 C/C++,在我的系统 Intel 2600K (3.4 ghz) 上,对 400 万个 64 位无符号整数进行合并排序只需不到 1/2 秒。

关于java - 单线程与多线程 : Mergesort performance comparison,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32742085/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com