java - Alternate lines skipped when reading a .tsv file

Reposted. Author: 行者123. Updated: 2023-12-01 19:29:28

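I have a .tsv file with 39 columns. The last column holds string data more than 100,000 characters long. Line 1 of the file contains the headers, followed by the data rows.

When I read the file, the reader goes from line 1 to line 3, then line 5, then line 7, even though every line contains the same data. This is the log I get: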

lineNo=3, rowNo=2, customer=503837-100 , last but one cell length=111275
lineNo=5, rowNo=3, customer=503837-100 , last but one cell length=111275
lineNo=7, rowNo=4, customer=503837-100 , last but one cell length=111275
lineNo=9, rowNo=5, customer=503837-100 , last but one cell length=111275
lineNo=11, rowNo=6, customer=503837-100 , last but one cell length=111275
lineNo=13, rowNo=7, customer=503837-100 , last but one cell length=111275
lineNo=15, rowNo=8, customer=503837-100 , last but one cell length=111275
lineNo=17, rowNo=9, customer=503837-100 , last but one cell length=111275
lineNo=19, rowNo=10, customer=503837-100 , last but one cell length=111275

Here is my code:

import java.io.FileReader;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.prefs.CsvPreference;

public class readWithCsvBeanReader {

    public static void main(String[] args) throws Exception {
        readWithCsvBeanReader();
    }

    private static void readWithCsvBeanReader() throws Exception {

        ICsvBeanReader beanReader = null;

        try {
            beanReader = new CsvBeanReader(new FileReader("C:\\MAP TSV\\abc.tsv"), CsvPreference.TAB_PREFERENCE);
            // the header elements are used to map the values to the bean (names must match)
            final String[] header = beanReader.getHeader(true);
            final CellProcessor[] processors = getProcessors();
            TSVReaderBrandDTO tsvReaderBrandDTO = new TSVReaderBrandDTO();

            int i = 0;
            int last = 0;

            while ((tsvReaderBrandDTO = beanReader.read(TSVReaderBrandDTO.class, header, processors)) != null) {
                if (null == tsvReaderBrandDTO.getPage_cache()) {
                    last = 0;
                } else {
                    last = tsvReaderBrandDTO.getPage_cache().length();
                }
                System.out.println(String.format("lineNo=%s, rowNo=%s, customer=%s , last but one cell length=%s",
                        beanReader.getLineNumber(), beanReader.getRowNumber(),
                        tsvReaderBrandDTO.getUnique_ID(), last));
                i++;
            }

            System.out.println("Number of rows : " + i);

        } finally {
            if (beanReader != null) {
                beanReader.close();
            }
        }
    }

    private static CellProcessor[] getProcessors() {

        // one processor per column: 39 in total
        final CellProcessor[] processors = new CellProcessor[] {
            new Optional(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
            new NotNull(), new NotNull(), new NotNull(), new Optional()};

        return processors;
    }
}

Please tell me where I am going wrong.

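Best answer

If you use a CSV parser to parse TSV input, you are going to run into trouble. Use a proper TSV parser. uniVocity-parsers comes with a TSV parser/writer. You can also use annotated Java beans to parse the file directly into instances of a class.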
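
One possible explanation for the log in the question, offered as an assumption rather than a confirmed diagnosis: Super CSV's CsvPreference.TAB_PREFERENCE still treats " as a quote character, and getLineNumber() counts physical lines while getRowNumber() counts records. If the long column contains a single unescaped double quote on each line, a quote-aware reader keeps reading until the quote closes on the next line, so every record swallows two physical lines. The stdlib-only sketch below imitates that behaviour. QuoteMergeDemo, quoteAwareRecords and the sample data are hypothetical names made up for illustration, and the loop is a simplification, not Super CSV's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: a simplified quote-aware record reader (hypothetical, not Super CSV code). */
class QuoteMergeDemo {

    /**
     * Groups physical lines into records the way a quote-aware parser does:
     * a line break that occurs while a quote is open stays inside the current record.
     */
    static List<String> quoteAwareRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (String line : lines) {
            if (current.length() > 0) {
                current.append('\n'); // continuation of a record that is still inside quotes
            }
            current.append(line);
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) == '"') {
                    inQuotes = !inQuotes; // every quote character toggles the state
                }
            }
            if (!inQuotes) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // quote never closed before end of input
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed sample: a header plus four data lines, each with ONE stray quote in the last column.
        List<String> lines = Arrays.asList(
                "id\tpage_cache",
                "1\t<a href=\"x>",
                "2\t<a href=\"y>",
                "3\t<a href=\"z>",
                "4\t<a href=\"w>");
        List<String> records = quoteAwareRecords(lines);
        System.out.println("physical lines = " + lines.size() + ", records = " + records.size());
    }
}
```

With this sample, five physical lines collapse into three records (the header plus two merged pairs), which mirrors the lineNo/rowNo pattern in the log. Whether the real file actually contains such quotes would need to be checked against the data.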

Example:

This code parses a TSV file into rows.

TsvParserSettings settings = new TsvParserSettings();

// creates a TSV parser
TsvParser parser = new TsvParser(settings);

// parses all rows in one go.
List<String[]> allRows = parser.parseAll(new FileReader(yourFile));
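
To show roughly what separates a TSV reader from a CSV reader, here is a minimal stdlib-only sketch. It assumes a common TSV convention: every physical line is one row, quote characters have no special meaning, and tabs or line breaks inside a value are written as the escape sequences \t, \n, \r and \\. PlainTsvSketch, parseLine and parseAll are hypothetical names, not part of uniVocity-parsers, and the sketch does not claim to match that library's behaviour in every detail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal TSV reader: one physical line per row, no quote handling. */
class PlainTsvSketch {

    /** Splits one line on tab characters and decodes \t, \n, \r and \\ escape sequences. */
    static String[] parseLine(String line) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                cells.add(cell.toString()); // a raw tab always ends the current cell
                cell.setLength(0);
            } else if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                switch (next) {
                    case 't': cell.append('\t'); break;
                    case 'n': cell.append('\n'); break;
                    case 'r': cell.append('\r'); break;
                    case '\\': cell.append('\\'); break;
                    default:
                        cell.append(c); // unknown escape: keep the backslash literally
                        i--;            // and let the next character be processed normally
                }
            } else {
                cell.append(c); // includes '"', which is just data here
            }
        }
        cells.add(cell.toString());
        return cells.toArray(new String[0]);
    }

    /** Reads every non-empty line of the input as one row. */
    static List<String[]> parseAll(Reader input) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    rows.add(parseLine(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample: a stray quote in row 1 and an escaped line break in row 2.
        String data = "id\tpage_cache\n1\t<a href=\"x>\n2\tline1\\nline2\n";
        for (String[] row : parseAll(new StringReader(data))) {
            System.out.println(row.length + " cells, last cell length " + row[row.length - 1].length());
        }
    }
}
```

Because the quote character is treated as ordinary data, a stray " in a long column cannot cause two lines to be merged into one row.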

Use a BeanListProcessor to parse into Java beans:

BeanListProcessor<TestBean> rowProcessor = new BeanListProcessor<TestBean>(TestBean.class);

TsvParserSettings parserSettings = new TsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);

TsvParser parser = new TsvParser(parserSettings);
parser.parse(new FileReader(yourFile));

// The BeanListProcessor provides a list of objects extracted from the input.
List<TestBean> beans = rowProcessor.getBeans();

The TestBean class looks like this:

import java.math.BigDecimal;
import com.univocity.parsers.annotations.*;

class TestBean {

    // if the value parsed in the quantity column is "?" or "-", it will be replaced by null.
    @NullString(nulls = { "?", "-" })
    // if a value resolves to null, it will be converted to the String "0".
    @Parsed(defaultNullRead = "0")
    private Integer quantity;

    // the value is read from the column at index 4, trimmed and lower-cased.
    @Trim
    @LowerCase
    @Parsed(index = 4)
    private String comments;

    // you can also explicitly give the name of a column in the file.
    @Parsed(field = "amount")
    private BigDecimal amount;

    @Trim
    @LowerCase
    // values "no", "n" and "null" will be converted to false; values "yes" and "y" will be converted to true
    @BooleanString(falseStrings = { "no", "n", "null" }, trueStrings = { "yes", "y" })
    @Parsed
    private Boolean pending;
}

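Disclosure: I am the author of this library. It is open source and free (Apache V2.0 license).

Regarding "java - Alternate lines skipped when reading a .tsv file", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21180552/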
