
java - Different results with different numbers of threads

Reposted. Author: 行者123. Updated: 2023-12-01 16:48:51

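I am trying to read a file in chunks and pass each chunk to a thread that counts how many times each byte value occurs in that chunk. The problem is that when I pass the whole file to a single thread I get the correct result, but when I split it across multiple threads the results become very strange. Here is my code: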

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Main{

    public static void main(String[] args) throws InterruptedException, ExecutionException, IOException
    {
        // get number of threads to be run
        Scanner in = new Scanner(System.in);
        int numberOfThreads = in.nextInt();

        // read file
        File file = new File("testfile.txt");
        long fileSize = file.length();
        long chunkSize = fileSize / numberOfThreads;

        FileInputStream input = new FileInputStream(file);
        byte[] buffer = new byte[(int)chunkSize];

        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        Set<Future<int[]>> set = new HashSet<Future<int[]>>();

        while(input.available() > 0)
        {

            if(input.available() < chunkSize)
            {
                chunkSize = input.available();
            }

            input.read(buffer, 0, (int) chunkSize);

            Callable<int[]> callable = new FrequenciesCounter(buffer);
            Future<int[]> future = pool.submit(callable);
            set.add(future);
        }

        // let's assume we will use extended ASCII characters only
        int alphabet = 256;

        // hold how many times each character is contained in the input file
        int[] frequencies = new int[alphabet];

        // sum the frequencies from each thread
        for(Future<int[]> future: set)
        {
            for(int i = 0; i < alphabet; i++)
            {
                frequencies[i] += future.get()[i];
            }
        }

        input.close();

        for(int i = 0; i< frequencies.length; i++)
        {
            if(frequencies[i] > 0) System.out.println((char)i + " " + frequencies[i]);
        }
    }

}

//help class for multithreaded frequencies' counting
class FrequenciesCounter implements Callable<int[]>
{
    private int[] frequencies = new int[256];
    private byte[] input;

    public FrequenciesCounter(byte[] buffer)
    {
        input = buffer;
    }

    public int[] call()
    {
        for(int i = 0; i < input.length; i++)
        {
            frequencies[(int)input[i]]++;
        }

        return frequencies;
    }
}

My testfile.txt contains aaaaaaaaaaaaaabbbbcccccc. With 1 thread, the output is:

a 14
b 4
c 6

With 2 threads, the output is:

a 4
b 8
c 12

With 3 threads, the output is:

b 6
c 18

There are other strange results that I cannot figure out. Can anyone help?

Best answer

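Every thread is given the same buffer, so the buffer is overwritten by the next read while a worker thread is still processing it (or before it has started). That matches the output above: with 2 threads the chunk size is 12, the last read leaves aabbbbcccccc in the buffer, and counting it twice gives a 4, b 8, c 12. With 3 threads the last 8-byte read leaves bbcccccc, and counting it three times gives b 6, c 18.

You need to make sure that each thread has its own buffer that nothing else can modify.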
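One way to apply that advice is sketched below. It is only an illustration, not the original poster's final code: the class name ChunkedByteCounter and the methods countChunk and countBytes are hypothetical, and the input comes from an in-memory stream instead of testfile.txt so the example runs on its own. A fresh array is allocated for every chunk, the return value of read is used to trim a short chunk, and bytes are masked with 0xFF so values of 0x80 and above do not produce negative indexes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class for illustration; not part of the original question.
public class ChunkedByteCounter {

    // Counts byte values in one chunk. The chunk is owned by a single task.
    public static int[] countChunk(byte[] chunk) {
        int[] frequencies = new int[256];
        for (byte b : chunk) {
            frequencies[b & 0xFF]++; // mask: Java bytes are signed
        }
        return frequencies;
    }

    public static int[] countBytes(InputStream input, int chunkSize, int numberOfThreads)
            throws IOException, InterruptedException, ExecutionException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<int[]>> futures = new ArrayList<>();
            while (true) {
                byte[] buffer = new byte[chunkSize]; // fresh buffer for every chunk
                int read = input.read(buffer, 0, chunkSize);
                if (read == -1) {
                    break; // end of stream
                }
                // Trim when fewer bytes than requested were read.
                final byte[] chunk = (read == chunkSize) ? buffer : Arrays.copyOf(buffer, read);
                Callable<int[]> task = () -> countChunk(chunk);
                futures.add(pool.submit(task));
            }
            // Sum the per-chunk counts once every task has finished.
            int[] total = new int[256];
            for (Future<int[]> future : futures) {
                int[] partial = future.get();
                for (int i = 0; i < total.length; i++) {
                    total[i] += partial[i];
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "aaaaaaaaaaaaaabbbbcccccc".getBytes(StandardCharsets.US_ASCII);
        int threads = 3;
        int chunkSize = Math.max(1, data.length / threads);
        int[] frequencies = countBytes(new ByteArrayInputStream(data), chunkSize, threads);
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > 0) System.out.println((char) i + " " + frequencies[i]);
        }
    }
}
```

With the sample input from the question, this prints a 14, b 4 and c 6 whether one thread or several are used, because no task can see another task's chunk. Allocating a buffer per chunk costs extra memory, so for very large files it may be worth bounding the number of chunks in flight.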

Regarding java - Different results with different numbers of threads, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44738628/
