
java - How to prevent code repetition in a while loop that processes data in chunks?

Reposted. Author: 行者123. Updated: 2023-12-05 00:13:32

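I am reading a file, collecting the processed lines, and writing them out in batches (for example to a file or a database) after each collected chunk.

When the loop terminates (= the file has been read completely), I have to call the writer once more. Otherwise I would miss the last chunk.

Question: can I somehow improve the code to avoid repeating the extra write() call?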

List<String> collect = new ArrayList<>();

String line;
while ((line = reader.read()) != null) {
    String processed = processline(line);
    collect.add(processed);

    //write each x chunks to file
    if (collect.size() % 1000 == 0) {
        writer.write(collect);
        collect = new ArrayList<>();
    }
}

//can I prevent repetition here?
if (!collect.isEmpty()) {
    writer.write(collect);
}

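Best answer

Encapsulate the buffering logic (because that is what you are doing: buffering) in a separate class. You still always have to write in two situations, when the buffer grows too large and when you have finished reading, but the second write then lives in close() instead of after your loop.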

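import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class BufferingWriter implements Closeable {
    private static final int MAX_BUFFER_SIZE = 1000;

    private final List<String> buffer = new ArrayList<>(MAX_BUFFER_SIZE);
    private final MyWriter writer;

    // The usage below passes the underlying writer in, so a constructor is needed
    BufferingWriter(MyWriter writer) {
        this.writer = writer;
    }

    public void write(String line) {
        buffer.add(line);
        if (buffer.size() >= MAX_BUFFER_SIZE) {
            flush();
        }
    }

    public void flush() {
        // Nothing to do for an empty buffer (e.g. an empty input file)
        if (buffer.isEmpty()) {
            return;
        }
        writer.write(buffer);
        buffer.clear();
    }

    @Override
    public void close() throws IOException {
        // Writes the last, possibly partial, chunk
        flush();
        // TBD: Pass the close call onto MyWriter if that is possible
        // or otherwise flag this writer as closed
    }
}

// try-with-resources calls close(), and therefore the final flush()
try (BufferingWriter bwriter = new BufferingWriter(writer)) {
    String line;
    while ((line = reader.read()) != null) {
        String processed = processline(line);
        bwriter.write(processed); // write the processed line, not the raw one
    }
}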
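
To check that the final partial chunk really gets written without an extra call after the loop, the same idea can be sketched in a self-contained form. This is only an illustration under assumptions: the class name BatchingSink, its add() method, and the use of a Consumer<List<String>> in place of MyWriter are hypothetical and do not come from the question or the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: collects lines and hands them to a sink in batches. */
final class BatchingSink implements AutoCloseable {
    // Stand-in for the real writer (an assumption, MyWriter is not shown in the source)
    private final Consumer<List<String>> sink;
    private final int batchSize;
    private List<String> buffer = new ArrayList<>();

    BatchingSink(Consumer<List<String>> sink, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Buffers one line and flushes once the batch is full. */
    void add(String line) {
        buffer.add(line);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the current batch to the sink; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(buffer);
        // Start a fresh list so the sink can keep the batch it received
        buffer = new ArrayList<>();
    }

    /** Flushes the last, possibly partial, batch. */
    @Override
    public void close() {
        flush();
    }
}
```

Used in a try-with-resources block, the loop body only calls add(); the trailing partial batch is emitted by close(). Allocating a new list on each flush, rather than clearing the old one, is a design choice that keeps batches intact if the sink holds on to them.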

Regarding "java - How to prevent code repetition in a while loop that processes data in chunks?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45736332/
