
java - Univocity parser: TextParsingException while parsing a line which has a starting double quote (") but does not have an ending double quote (")

Reposted. Author: 搜寻专家. Updated: 2023-11-01 02:59:31

An exception is thrown while parsing a file:

com.univocity.parsers.common.TextParsingException: Length of parsed input (4097) exceeds the maximum number of characters defined in your parser settings (4096). 
Identified line separator characters in the parsed content. This may be the cause of the error. The line separator in your parser settings is set to '\r\n'. Parsed content: The quick brown fox jumps over the lazy dog.|[\n]

File contents:

1234|5678|The quick brown fox jumps over the lazy dog.|
1234|5678|"The quick brown fox jumps over the lazy dog.|
1234|5678|The quick brown fox jumps over the lazy dog.|
1234|5678|The quick brown fox jumps over the lazy dog.|
1234|5678|The quick brown fox jumps over the lazy dog.|
.........
.........
1234|5678|The quick brown fox jumps over the lazy dog.|

I am using the following CSV parser settings:

CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.setLineSeparatorDetectionEnabled(true);
parserSettings.getFormat().setDelimiter('|');
parserSettings.setIgnoreLeadingWhitespaces(true);
parserSettings.setIgnoreTrailingWhitespaces(true);
parserSettings.setHeaderExtractionEnabled(false);
parserSettings.setMaxCharsPerColumn(4096);

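From the exception I can infer that the second line has an opening double quote (") but does not end with a closing double quote ("). In this case the parser keeps reading that column toward EOF (end of file), so the column length exceeds the configured 4096-character limit.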

Tested with build: 2.2.2

Best answer

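This is how CSV parsers work. When a quote is found, the parser assumes a quoted value begins, because the content after the quote may contain delimiters, line endings, or other (hopefully escaped) quotes.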
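To make that behavior concrete, here is a minimal sketch of a quote-aware splitter. It is a toy state machine written for illustration, not univocity's implementation; the class name `QuoteScanDemo` and the method `parse` are hypothetical, and the sample rows are shortened versions of the ones in the question. With `"` as the quote character, the unclosed quote on the second row swallows everything after it into one field. With the quote character set to `'\0'`, every row splits normally.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteScanDemo {

    // Toy splitter (hypothetical helper). Passing '\0' as the quote
    // character disables quote handling in this sketch.
    static List<List<String>> parse(String input, char delimiter, char quote) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuotes) {
                // Inside a quoted value, delimiters and newlines are plain data.
                if (c == quote) {
                    inQuotes = false;
                } else {
                    field.append(c);
                }
            } else if (quote != '\0' && c == quote && field.length() == 0) {
                // A quote at the start of a field opens a quoted value.
                inQuotes = true;
            } else if (c == delimiter) {
                row.add(field.toString());
                field.setLength(0);
            } else if (c == '\n') {
                row.add(field.toString());
                field.setLength(0);
                rows.add(row);
                row = new ArrayList<>();
            } else {
                field.append(c);
            }
        }
        // Flush whatever is pending at end of input, including an unclosed quoted value.
        if (inQuotes || field.length() > 0 || !row.isEmpty()) {
            row.add(field.toString());
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        String input = "1|2|fox|\n1|2|\"fox|\n1|2|fox|\n";
        // Quote handling on: rows 2 and 3 collapse into a single record.
        System.out.println(parse(input, '|', '"').size());  // prints 2
        // Quote handling off: each line is its own record.
        System.out.println(parse(input, '|', '\0').size()); // prints 3
    }
}
```

In a real file the swallowed field keeps growing until the parser's column-length limit is hit, which matches the "exceeds the maximum number of characters" message in the question.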

In your case, the only way to work around this situation is to do the following:

parserSettings.getFormat().setQuote('\0');

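This makes the parser ignore quotes and process such values as unquoted values. Once a line ending or a delimiter is found, the value is collected as you expect.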
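Putting the question's settings together with this change might look like the sketch below. It assumes the univocity-parsers library is on the classpath; the class name `UnquotedPipeParsing`, the helper `parseIgnoringQuotes`, and the shortened in-memory sample are made up for illustration and do not come from the original post.

```java
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.io.StringReader;
import java.util.List;

public class UnquotedPipeParsing {

    // Hypothetical helper: parses pipe-delimited text with quote handling disabled.
    static List<String[]> parseIgnoringQuotes(String content) {
        CsvParserSettings parserSettings = new CsvParserSettings();
        parserSettings.setLineSeparatorDetectionEnabled(true);
        parserSettings.getFormat().setDelimiter('|');
        // '\0' should not occur in the data, so no value is ever treated as quoted.
        parserSettings.getFormat().setQuote('\0');
        parserSettings.setIgnoreLeadingWhitespaces(true);
        parserSettings.setIgnoreTrailingWhitespaces(true);
        parserSettings.setHeaderExtractionEnabled(false);
        parserSettings.setMaxCharsPerColumn(4096);

        CsvParser parser = new CsvParser(parserSettings);
        return parser.parseAll(new StringReader(content));
    }

    public static void main(String[] args) {
        // Shortened sample; the second row has an opening quote and no closing quote.
        String content =
                "1234|5678|The quick brown fox jumps over the lazy dog.|\n"
              + "1234|5678|\"The quick brown fox jumps over the lazy dog.|\n"
              + "1234|5678|The quick brown fox jumps over the lazy dog.|\n";

        for (String[] row : parseIgnoringQuotes(content)) {
            System.out.println(String.join(" / ", java.util.Arrays.asList(row).toString()));
        }
    }
}
```

The trade-off is that values which are legitimately quoted also keep their quote characters, so this fits only files where quotes carry no special meaning.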

Regarding java - Univocity parser: TextParsingException while parsing a line which has a starting double quote (") but does not have an ending double quote ("), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39773370/
