
java - Reading a single file with multiple threads: should it speed up?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 04:04:18

I am reading a file that contains 500000 lines, and I am testing how multithreading speeds up the process:

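// Imports required by the methods in this listing (they belong at the top of the file).
import java.io.RandomAccessFile;
import java.util.concurrent.TimeUnit;

// Starts num threads; each thread reads the whole file on its own.
private void multiThreadRead(int num) {
    for (int i = 1; i <= num; i++) {
        new Thread(readIndivColumn(i), "" + i).start();
    }
}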

private Runnable readIndivColumn(final int colNum) {
    return new Runnable() {
        @Override
        public void run() {
            long startTime = System.currentTimeMillis();
            System.out.println("From Thread no:" + colNum + " Start time:" + startTime);

            // try-with-resources closes the file even if reading fails.
            try (RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r")) {
                String line;
                while ((line = raf.readLine()) != null) {
                    // System.out.println(StatUtils.getCellValue(line, colNum));
                }

                long elapsedTime = System.currentTimeMillis() - startTime;
                // Split the elapsed milliseconds into whole minutes and leftover seconds.
                String formattedTime = String.format("%d min, %d sec",
                        TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                        TimeUnit.MILLISECONDS.toSeconds(elapsedTime)
                                - TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));

                System.out.println("From Thread no:" + colNum + " Finished Time:" + formattedTime);
            } catch (Exception e) {
                System.out.println("From Thread no:" + colNum + "===>" + e.getMessage());
                e.printStackTrace();
            }
        }
    };
}

// Reads the whole file num times, one pass after another, on the calling thread.
private void sequentialRead(int num) {
    try {
        long startTime = System.currentTimeMillis();
        System.out.println("Start time:" + startTime);

        for (int i = 0; i < num; i++) {
            // try-with-resources closes the file after each pass.
            try (RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r")) {
                String line;
                while ((line = raf.readLine()) != null) {
                    // System.out.println(line);
                }
            }
        }

        long elapsedTime = System.currentTimeMillis() - startTime;
        String formattedTime = String.format("%d min, %d sec",
                TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                TimeUnit.MILLISECONDS.toSeconds(elapsedTime)
                        - TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));

        System.out.println("Finished Time:" + formattedTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
public TesterClass() {
    sequentialRead(1);
    this.multiThreadRead(1);
}
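
One thing to note about the test above: multiThreadRead only starts the threads and returns, so each thread prints its own elapsed time and nothing reports when the last one finished. A minimal sketch of a harness that waits for every worker with Thread.join() and reports the total wall-clock time might look like the following. The names ParallelTimer and timeConcurrently and the sleeping placeholder task are hypothetical and exist only for illustration; in the test above the task would be the Runnable returned by readIndivColumn(i).

```java
// Hypothetical helper, not part of the original question.
final class ParallelTimer {

    // Starts `count` threads running `task`, waits for all of them,
    // and returns the total wall-clock time in milliseconds.
    static long timeConcurrently(int count, Runnable task) {
        Thread[] workers = new Thread[count];
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            workers[i] = new Thread(task, "reader-" + (i + 1));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // block until this worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); replace with the real file read.
        Runnable placeholder = () -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long elapsedMs = timeConcurrently(5, placeholder);
        System.out.println("5 concurrent tasks finished after " + elapsedMs + " ms");
    }
}
```

Comparing this single total against the sequential total gives a like-for-like number, instead of comparing one sequential total with several per-thread times.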

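For num = 1 I get the following results:

Start time:1326224619049
Finished Time:2 min, 14 sec

Sequential read ENDS...........

Multi-Thread read starts:

From Thread no:1 Start time:1326224753606
From Thread no:1 Finished Time:2 min, 13 sec
Multi-Thread read ENDS.....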

For num = 5 I get the following results:

    formatted Time:10 min, 20 sec

Sequential read ENDS...........

Multi-Thread read starts:

From Thread no:1 Start time:1326223509574
From Thread no:3 Start time:1326223509574
From Thread no:4 Start time:1326223509574
From Thread no:5 Start time:1326223509574
From Thread no:2 Start time:1326223509574
From Thread no:4 formatted Time:5 min, 54 sec
From Thread no:2 formatted Time:6 min, 0 sec
From Thread no:3 formatted Time:6 min, 7 sec
From Thread no:5 formatted Time:6 min, 23 sec
From Thread no:1 formatted Time:6 min, 23 sec
Multi-Thread read ENDS.....

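My question is: shouldn't the multi-threaded read take about 2 min 13 sec, the same as a single read? Could you explain why the multi-threaded solution takes so long?

Thanks in advance.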

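Best answer

The parallel reads slow down because the disk head has to seek to the next read position for each thread, and each seek takes about 5 ms. Reading with several threads therefore makes the disk bounce back and forth between positions, which slows everything down. The only recommended way to read a file from a single disk is to read it sequentially with one thread.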
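
As a rough illustration of the "one thread, sequential" recommendation, here is a minimal sketch. It rests on several assumptions that are not in the original answer: it swaps RandomAccessFile for a BufferedReader (RandomAccessFile does no buffering of its own, so this is a different technique from the question's code), it assumes the CSV is UTF-8, and the names SequentialLineReader and countLines are hypothetical. Whether it changes the timings reported in the question has not been measured here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original question or answer.
final class SequentialLineReader {

    // Reads every line from the reader on the calling thread and returns the line count.
    static long countLines(BufferedReader reader) {
        try (BufferedReader in = reader) {
            long lines = 0;
            while (in.readLine() != null) {
                lines++; // per-line processing (e.g. picking a column) would go here
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the file with a buffered, sequential reader (UTF-8 is an assumption).
    static long countLines(Path file) {
        try {
            return countLines(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The default path is the one used in the question.
        Path file = Paths.get(args.length > 0 ? args[0] : "./src/test/test1.csv");
        long start = System.nanoTime();
        long lines = countLines(file);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Read " + lines + " lines in " + elapsedMs + " ms");
    }
}
```

If several columns are needed, one pass like this can extract all of them from each line, rather than having one reader per column walk the same file.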

Regarding "java - Reading a single file with multiple threads: should it speed up?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8809894/
