
java - Parsing CSV with OpenCSV with double quotes inside a quoted field

Reposted. Author: 行者123. Updated: 2023-12-04 02:07:33

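I'm trying to parse a CSV file using OpenCSV. One of the columns stores its data in YAML serialized format and is quoted because it can contain commas. It also contains quotes, which are escaped by doubling them. I can parse this file easily in Ruby, but with OpenCSV I can't parse it fully. The file is UTF-8 encoded.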

Here is the Java snippet I'm using to read the file:

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream(csvFilePath), "UTF-8"), ',', '\"', '\\');

Here are 2 rows from the file. The first row is not parsed properly and gets split at ""[Fair Trade Certified]"", I guess because of the escaped double quotes.
1061658767,update,1196916,Product,28613099,Product::Source,"---
product_attributes:
-
- :name: Ornaments
:brand_id: 49120
:size: each
:alcoholic: false
:details: ""[Fair Trade Certified]""
:gluten_free: false
:kosher: false
:low_fat: false
:organic: false
:sugar_free: false
:fat_free: false
:vegan: false
:vegetarian: false
",,2015-11-01 00:06:19.796944,,,,,,
1061658768,create,,,28613100,Product::Source,"---
product_id:
retailer_id:
store_id:
source_id: 333790
locale: en_us
source_type: Product::PrehistoricProductDatum
priority: 1
is_definition:
product_attributes:
",,2015-11-01 00:06:19.927948,,,,,,

Best answer

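First off, I'm glad FastCSV worked for you, but I ran the suspect substring through openCSV 3.9 and it parsed correctly with both the CSVParser and the RFC4180Parser. Could you give a little more detail on how it fails to parse, and/or try it with openCSV 3.9 to see whether you hit the same issue, and then try the configurations below?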

Here are the tests I used:

CSVParser:

@Test
public void parseBigStringFromStackOverflowWithMultipleQuotesInLine() throws IOException {

String bigline = "28613099,Product::Source,\"---\n" +
"product_attributes:\n" +
"-\n" +
"- :name: Ornaments\n" +
" :brand_id: 49120\n" +
" :size: each\n" +
" :alcoholic: false\n" +
" :details: \"\"[Fair Trade Certified]\"\"\n" +
" :gluten_free: false\n" +
" :kosher: false\n" +
" :low_fat: false\n" +
" :organic: false\n" +
" :sugar_free: false\n" +
" :fat_free: false\n" +
" :vegan: false\n" +
" :vegetarian: false\n" +
"\",,2015-11-01 00:06:19.796944";

String suspectString = "---\n" +
"product_attributes:\n" +
"-\n" +
"- :name: Ornaments\n" +
" :brand_id: 49120\n" +
" :size: each\n" +
" :alcoholic: false\n" +
" :details: \"[Fair Trade Certified]\"\n" +
" :gluten_free: false\n" +
" :kosher: false\n" +
" :low_fat: false\n" +
" :organic: false\n" +
" :sugar_free: false\n" +
" :fat_free: false\n" +
" :vegan: false\n" +
" :vegetarian: false\n" ;

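StringReader stringReader = new StringReader(bigline);

// Build a CSVReader with the default CSVParser settings.
CSVReaderBuilder builder = new CSVReaderBuilder(stringReader);
// BOTH: empty fields and empty quoted fields are returned as null.
CSVReader csvReader = builder.withFieldAsNull(CSVReaderNullFieldIndicator.BOTH).build();

String[] item = csvReader.readNext();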

assertEquals(5, item.length);
assertEquals("28613099", item[0]);
assertEquals("Product::Source", item[1]);
assertEquals(suspectString, item[2]);
}
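
The expectation encoded in the test above, that `""` inside a quoted field comes back as a single `"`, can be illustrated without any library. The following is a minimal sketch in plain Java of RFC 4180 style field splitting. It is not OpenCSV's implementation, and the names `QuotedFieldSketch` and `splitRecord` are hypothetical, chosen only for this illustration. It assumes the whole record, embedded newlines included, is already in one string.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: this is NOT OpenCSV's parser.
// The class and method names are hypothetical.
class QuotedFieldSketch {

    // Splits one CSV record into fields, RFC 4180 style: inside a quoted
    // field, two consecutive quotes stand for one literal quote, and
    // commas and newlines are ordinary data.
    static List<String> splitRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote -> literal quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // comma or newline is data here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of the field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Shortened version of the record from the question.
        String record = "28613099,Product::Source,\"---\n"
                + " :details: \"\"[Fair Trade Certified]\"\"\n"
                + "\",,2015-11-01 00:06:19.796944";
        List<String> fields = splitRecord(record);
        System.out.println(fields.size() + " fields");
        System.out.println(fields.get(2));
    }
}
```

A parser that treats the second quote of `""` as the end of the field would instead split the record at `[Fair Trade Certified]`, which matches the symptom described in the question.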

RFC4180Parser (a Spock test, written in Groovy):

def 'parse big line from stackoverflow with complex string'() {
given:
RFC4180ParserBuilder builder = new RFC4180ParserBuilder()
RFC4180Parser parser = builder.build()
String bigline = "28613099,Product::Source,\"---\n" +
"product_attributes:\n" +
"-\n" +
"- :name: Ornaments\n" +
" :brand_id: 49120\n" +
" :size: each\n" +
" :alcoholic: false\n" +
" :details: \"\"[Fair Trade Certified]\"\"\n" +
" :gluten_free: false\n" +
" :kosher: false\n" +
" :low_fat: false\n" +
" :organic: false\n" +
" :sugar_free: false\n" +
" :fat_free: false\n" +
" :vegan: false\n" +
" :vegetarian: false\n" +
"\",,2015-11-01 00:06:19.796944"

String suspectString = "---\n" +
"product_attributes:\n" +
"-\n" +
"- :name: Ornaments\n" +
" :brand_id: 49120\n" +
" :size: each\n" +
" :alcoholic: false\n" +
" :details: \"[Fair Trade Certified]\"\n" +
" :gluten_free: false\n" +
" :kosher: false\n" +
" :low_fat: false\n" +
" :organic: false\n" +
" :sugar_free: false\n" +
" :fat_free: false\n" +
" :vegan: false\n" +
" :vegetarian: false\n"

when:
String[] values = parser.parseLine(bigline)

then:
values.length == 5
values[0] == "28613099"
values[1] == "Product::Source"
values[2] == suspectString
}
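
Both tests hand the parser the complete record as one string, newlines included. When the data comes from a file, physical lines first have to be grouped into records. OpenCSV's `CSVReader` takes care of that itself, so the following plain Java sketch is only meant to illustrate the idea. It assumes quotes are escaped solely by doubling (no backslash escapes), so a record is complete once it contains an even number of quote characters. The names `MultiLineRecordSketch`, `endsInsideQuotes` and `readRecords` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: this is NOT how OpenCSV is implemented.
// The class and method names are hypothetical.
class MultiLineRecordSketch {

    // True if the text contains an odd number of quote characters, meaning
    // a quoted field is still open. Doubled quotes add two, so they do not
    // change the parity.
    static boolean endsInsideQuotes(String text) {
        int quotes = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 1;
    }

    // Groups physical lines into logical CSV records, rejoining lines that
    // belong to the same quoted field with '\n'.
    static List<String> readRecords(BufferedReader in) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder pending = null;
        String line;
        while ((line = in.readLine()) != null) {
            if (pending == null) {
                pending = new StringBuilder(line);
            } else {
                pending.append('\n').append(line);
            }
            if (!endsInsideQuotes(pending.toString())) {
                records.add(pending.toString());
                pending = null;
            }
        }
        if (pending != null) {
            records.add(pending.toString()); // unterminated quote: keep what was read
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String data = "1,\"---\n :details: \"\"[Fair Trade Certified]\"\"\n\",x\n2,plain,y\n";
        List<String> records = readRecords(new BufferedReader(new StringReader(data)));
        System.out.println(records.size() + " records");
    }
}
```

Each string returned by `readRecords` could then be split into fields by a parser that understands doubled quotes.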

Regarding java - Parsing CSV with OpenCSV with double quotes inside a quoted field, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/41948442/
