
java - Concurrent reading of a file (Java preferred)

Reposted. Author: IT老高. Updated: 2023-10-28 20:28:38

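I have a large file that takes several hours to process, so I am thinking of estimating chunk boundaries and reading the chunks in parallel.

Is it possible to read a single file concurrently? I have looked at both RandomAccessFile and nio.FileChannel, but based on other posts I am not sure whether this approach will work.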

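Best answer

The most important question here is what the bottleneck is in your case.

If the bottleneck is your disk IO, there is not much you can do on the software side. Parallelizing the computation will only make things worse, because reading the file from different parts simultaneously will degrade disk performance.

If the bottleneck is processing power and you have multiple CPU cores, you can take advantage of them by starting multiple threads that work on different parts of the file. You can safely create several InputStream or Reader instances to read different parts of the file in parallel (as long as you stay under your operating system's limit on the number of open files). You can split the work into tasks and run them in parallel, as in this example: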
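One rough way to tell which case applies is to time a plain sequential read of the file with no processing at all. The sketch below is only an illustration: the class name ReadTiming and the 64 KB buffer size are assumptions, not part of the original answer. If the raw read takes nearly as long as the full job, the disk is likely the bottleneck; if it is much faster, the processing is. Keep in mind that the operating system may cache the file, so a second run can be faster than the first.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the original answer.
final class ReadTiming {

    // Reads the whole file sequentially without processing it
    // and returns the number of bytes read.
    static long readAll(Path path) throws IOException {
        byte[] buffer = new byte[1 << 16]; // 64 KB buffer, an arbitrary choice
        long total = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]);
        long startNanos = System.nanoTime();
        long bytes = readAll(path);
        double seconds = (System.nanoTime() - startNanos) / 1e9;
        System.out.printf("Read %d bytes in %.2f s%n", bytes, seconds);
    }
}
```

Comparing this time with the duration of the full job gives a first estimate of how much there is to gain from extra threads.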

import java.io.*;
import java.util.*;
import java.util.concurrent.*;

public class Split {
    private File file;

    public Split(File file) {
        this.file = file;
    }

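    // Processes the given portion of the file.
    // Called simultaneously from several threads.
    // Use your custom return type as needed, I used String just to give an example.
    public String processPart(long start, long end)
        throws Exception
    {
        // try-with-resources closes the stream even if the computation throws
        try (InputStream is = new FileInputStream(file)) {
            // skip() may skip fewer bytes than requested, so loop until we reach start
            long toSkip = start;
            while (toSkip > 0) {
                long skipped = is.skip(toSkip);
                if (skipped <= 0)
                    throw new EOFException("Could not skip to offset " + start);
                toSkip -= skipped;
            }
            // do a computation using the input stream,
            // checking that we don't read more than (end-start) bytes
            System.out.println("Computing the part from " + start + " to " + end);
            Thread.sleep(1000);
            System.out.println("Finished the part from " + start + " to " + end);

            return "Some result";
        }
    }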

    // Creates a task that will process the given portion of the file,
    // when executed.
    public Callable<String> processPartTask(final long start, final long end) {
        return new Callable<String>() {
            public String call()
                throws Exception
            {
                return processPart(start, end);
            }
        };
    }

    // Splits the computation into chunks of the given size,
    // creates appropriate tasks and runs them using a
    // given number of threads.
    public void processAll(int noOfThreads, int chunkSize)
        throws Exception
    {
        long length = file.length();
        int count = (int)((length + chunkSize - 1) / chunkSize);
        java.util.List<Callable<String>> tasks = new ArrayList<Callable<String>>(count);
        for(int i = 0; i < count; i++) {
            // widen to long before multiplying so offsets beyond 2 GB do not overflow int
            long start = (long) i * chunkSize;
            tasks.add(processPartTask(start, Math.min(length, start + chunkSize)));
        }
        ExecutorService es = Executors.newFixedThreadPool(noOfThreads);

        java.util.List<Future<String>> results = es.invokeAll(tasks);
        es.shutdown();

        // use the results for something
        for(Future<String> result : results)
            System.out.println(result.get());
    }

    public static void main(String argv[])
        throws Exception
    {
        Split s = new Split(new File(argv[0]));
        s.processAll(8, 1000);
    }
}
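
The question also mentions nio.FileChannel. As a possible alternative to opening one stream per chunk, the sketch below shares a single FileChannel between the worker threads and uses positional reads, which take an explicit offset and do not change the channel's own position. The class name ChannelSplit, the 8 KB buffer, and the byte-counting "processing" are assumptions made for illustration; they are not part of the original answer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch for illustration; not part of the original answer.
final class ChannelSplit {

    // Reads bytes [start, end) through the shared channel and returns how many were read.
    // The "processing" here is only a byte count; replace it with real work.
    static long countBytes(FileChannel channel, long start, long end) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long position = start;
        while (position < end) {
            buffer.clear();
            // Never read past the end of this chunk.
            buffer.limit((int) Math.min(buffer.capacity(), end - position));
            // Positional read: does not touch the channel's own position.
            int n = channel.read(buffer, position);
            if (n < 0) break; // reached end of file early
            position += n;
        }
        return position - start;
    }

    // Splits the file into chunks, processes them in parallel and sums the results.
    static long processAll(Path path, int noOfThreads, long chunkSize) throws Exception {
        ExecutorService es = Executors.newFixedThreadPool(noOfThreads);
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            List<Callable<Long>> tasks = new ArrayList<>();
            for (long start = 0; start < size; start += chunkSize) {
                final long from = start;
                final long to = Math.min(size, start + chunkSize);
                tasks.add(() -> countBytes(channel, from, to));
            }
            long total = 0;
            // invokeAll blocks until every task has finished,
            // so the channel is still open while the tasks run.
            for (Future<Long> result : es.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Paths.get(args[0]), 8, 1 << 20) + " bytes processed");
    }
}
```

Using one channel keeps the number of open file handles at one regardless of the chunk count. Whether it is faster than separate streams depends on the disk and the workload, so it is worth measuring both.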

Regarding java - Concurrent reading of a file (Java preferred), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11867348/
