gpt4 book ai didi

java - ThreadPoolExecutor 创建重复线程

转载 作者:行者123 更新时间:2023-12-02 07:02:43 26 4
gpt4 key购买 nike

我需要读取一个大的 csv 文件(328 MB)并对其进行处理。每行的处理还包括调用 Web 服务。

我是第一次使用ThreadPoolExecutor。我的逻辑是,我将从 csv 中每 100 行吐出一次,并创建一个线程来运行和处理每一行,并将结果写入 templ 文件中。一旦所有线程完成,我将读取临时文件并创建一个混合输出文件。

我的分割文件并创建线程的方法

private List<Thread> invokeWS(String csvFilename, String tempFolder) {

List<Thread> processCsvThreadList = new ArrayList<Thread>();

//Thread Pool Executer


int corePoolSize = 3;
int maximumPoolSize = 6;
long keepAliveTime = 10;
ThreadFactory threadFactory = Executors.defaultThreadFactory();


ThreadPoolExecutor thrdPoolEx = new ThreadPoolExecutor(corePoolSize,
maximumPoolSize, keepAliveTime, TimeUnit.SECONDS,
new ArrayBlockingQueue<Runnable>(2));


try {
BufferedReader bfr = new BufferedReader(new FileReader(csvFilename));
String line = "";
int i = 0;
line = bfr.readLine();
Thread csvThread;
List<String> rowList = new ArrayList<String>();


do {
line = bfr.readLine();
if (line != null) {

rowList.add(line);
i++;

if (i % 100 == 0) {

csvThread = new Thread(new ProcessCsvRow(rowList,
tempFolder));
csvThread.start();
thrdPoolEx.execute(csvThread);

rowList = new ArrayList<String>();
processCsvThreadList.add(csvThread);
}

} else {
if (null != rowList && !rowList.isEmpty()) {

csvThread = new Thread(new ProcessCsvRow(rowList,
tempFolder));
csvThread.start();
thrdPoolEx.execute(csvThread);

processCsvThreadList.add(csvThread);
}
break;
}
} while (true);




} catch (FileNotFoundException fnf) {
fnf.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
finally{
thrdPoolEx.shutdown();
}
return processCsvThreadList;
}

我的 ProcessCsvRow 类

public class ProcessCsvRow implements Runnable {

private List<String> csvRowsList;
private String tempDir;

public ProcessCsvRow(List<String> csvRowsList, String tempDir) {

this.csvRowsList = csvRowsList;
this.tempDir = tempDir;
}

@Override
public void run() {
UUID idOne = UUID.randomUUID();
FileWriter fw = null;
BufferedWriter bufferedWriter = null;
try {
String res = "";
fw = new FileWriter(new File(tempDir + "\\" + idOne.toString()+FilePropConstants.FILE_NAME_EXT_TMP));

bufferedWriter = new BufferedWriter(fw);
SentimentAnalyzer sentimentAnalyzer = new SentimentAnalyzer();

for (String csvRow : csvRowsList) {
//calling webservice for each row

res = sentimentAnalyzer.invokeSentWS(csvRow);
bufferedWriter.write(res);


}

} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (bufferedWriter != null) {
bufferedWriter.flush();
bufferedWriter.close();
}
if (fw != null) {
fw.close();
}

} catch (IOException ex) {
ex.printStackTrace();
}
}
}

}

问题是,如果对于 5 行 csv,应该创建一个临时文件,但是当我运行此程序时,我生成了两个临时文件,这是错误的。我坚信这不是一个逻辑问题,而是我实现 ThreadPoolExecuter 的方式问题。

非常感谢任何帮助。

最佳答案

您不应该创建线程,也不需要直接创建线程池。

尝试

ExecutorService es = Executors.newFixedThreadPool(8);

es.submit(runnable); // not threads

顺便说一句,每个线程都必须创建自己的输出文件,或者您需要锁定共享文件,或者您可以提交一个 Callable 并让它返回您想要记录到提交线程的内容。

关于java - ThreadPoolExecutor 创建重复线程,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16455354/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com