java - Univocity - How to return one bean per row using iterator style?

Reposted. Author: 搜寻专家. Updated: 2023-10-31 19:53:18

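Introduction

I am building a process to merge several large, sorted CSV files, and I am currently looking into using Univocity for this. My approach to the merge is to use beans that implement the Comparable interface.

Given

A simplified file looks like this: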

id,data
1,aa
2,bb
3,cc

The bean looks like this (getters and setters omitted):

public class Address implements Comparable<Address> {

    @Parsed
    private int id;
    @Parsed
    private String data;

    @Override
    public int compareTo(Address o) {
        return Integer.compare(this.getId(), o.getId());
    }
}

The comparator looks like this:

public class AddressComparator implements Comparator<Address> {

    @Override
    public int compare(Address a, Address b) {
        if (a == null)
            throw new IllegalArgumentException("argument object a cannot be null");
        if (b == null)
            throw new IllegalArgumentException("argument object b cannot be null");
        return Integer.compare(a.getId(), b.getId());
    }
}

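Since I don't want to read all the data into memory, I want to read the top record of each file and run some comparison logic on it. Here is my simplified example: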

public class App {

    private static final String INPUT_1 = "src/test/input/address1.csv";
    private static final String INPUT_2 = "src/test/input/address2.csv";
    private static final String INPUT_3 = "src/test/input/address3.csv";

    public static void main(String[] args) throws FileNotFoundException {
        BeanListProcessor<Address> rowProcessor = new BeanListProcessor<Address>(Address.class);
        CsvParserSettings parserSettings = new CsvParserSettings();
        parserSettings.setRowProcessor(rowProcessor);
        parserSettings.setHeaderExtractionEnabled(true);
        CsvParser parser = new CsvParser(parserSettings);

        List<FileReader> readers = new ArrayList<>();
        readers.add(new FileReader(new File(INPUT_1)));
        readers.add(new FileReader(new File(INPUT_2)));
        readers.add(new FileReader(new File(INPUT_3)));

        // This parses all rows, but I am only interested in getting 1 row as a bean.
        for (FileReader fileReader : readers) {
            parser.parse(fileReader);
            List<Address> beans = rowProcessor.getBeans();
            for (Address address : beans) {
                System.out.println(address.toString());
            }
        }

        // want to have a map with the reader and the first bean object
        // Map<FileReader, Address> topRecordofReader = new HashMap<>();
        Map<FileReader, String[]> topRecordofReader = new HashMap<>();
        for (FileReader reader : readers) {
            parser.beginParsing(reader);
            String[] row;
            while ((row = parser.parseNext()) != null) {
                System.out.println(row[0]);
                System.out.println(row[1]);
                topRecordofReader.put(reader, row);
                // all done, only want to get first row
                break;
            }
        }
    }
}

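Question

In the example above, how can I parse in a way that iterates over the rows and returns one bean per row, instead of parsing the whole file at once?

I am looking for something like this (this invalid code is only meant to show the kind of solution I am after):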

for (FileReader fileReader : readers) {
    parser.beginParsing(fileReader);
    Address bean = null;
    while (bean = parser.parseNextRecord() != null) {
        topRecordofReader.put(fileReader, bean);
    }
}

Best answer

There are two ways to read iteratively instead of loading everything into memory. The first is to use a BeanProcessor instead of a BeanListProcessor:

settings.setRowProcessor(new BeanProcessor<Address>(Address.class) {
    @Override
    public void beanProcessed(Address address, ParsingContext context) {
        // your code to process each parsed object goes here
    }
});

To read beans iteratively without callbacks (and to perform some other common tasks), we created the CsvRoutines class, which extends AbstractRoutines:

File input = new File("/path/to/your.csv");

CsvParserSettings parserSettings = new CsvParserSettings();
//...configure the parser

// You can also use TSV and Fixed-width routines
CsvRoutines routines = new CsvRoutines(parserSettings);
for (Address address : routines.iterate(Address.class, input, "UTF-8")) {
    //process your bean
}
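
Because the question's goal is to merge several sorted files, here is a minimal sketch of how such per-file iterators might feed a k-way merge that keeps only one record per file in memory. SortedMerge, Head and merge are hypothetical names made up for this illustration and are not part of univocity-parsers. The sketch uses only the JDK, with plain integers standing in for parsed Address beans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of univocity-parsers. */
public class SortedMerge {

    /** Pairs the current head element of a source with the iterator it came from. */
    private static final class Head<T> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }
    }

    /**
     * Merges already-sorted sources into one sorted stream, holding at most
     * one pending element per source at any time.
     */
    public static <T> void merge(List<Iterator<T>> sources,
                                 Comparator<? super T> order,
                                 Consumer<? super T> sink) {
        PriorityQueue<Head<T>> heads =
                new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));

        // Seed the queue with the first element of every non-empty source.
        for (Iterator<T> source : sources) {
            if (source.hasNext()) {
                heads.add(new Head<>(source.next(), source));
            }
        }

        // Repeatedly emit the smallest head, then refill from the same source.
        while (!heads.isEmpty()) {
            Head<T> smallest = heads.poll();
            sink.accept(smallest.value);
            if (smallest.source.hasNext()) {
                heads.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
    }

    public static void main(String[] args) {
        // Plain integers stand in for the ids of parsed Address beans.
        List<Iterator<Integer>> sources = new ArrayList<>();
        sources.add(Arrays.asList(1, 4, 7).iterator());
        sources.add(Arrays.asList(2, 5, 8).iterator());
        sources.add(Arrays.asList(3, 6, 9).iterator());

        List<Integer> merged = new ArrayList<>();
        merge(sources, Comparator.naturalOrder(), merged::add);
        System.out.println(merged);
    }
}
```

Under the assumption that each file is already sorted by id, each source could be obtained as routines.iterate(Address.class, file, "UTF-8").iterator(), with an AddressComparator passed as the ordering and the sink writing each bean to the output file. A priority queue is used so that picking the next smallest record stays cheap even with many input files.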

Hope this helps!

Regarding "java - Univocity - How to return one bean per row using iterator style?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38051534/
