
C# - Parsing a semicolon-delimited file

Reposted. Author: 行者123. Updated: 2023-11-30 16:50:12

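I have a CSV file, but the delimiter is a semicolon (;) and every column is enclosed in double quotes. A ; also appears inside some values, for example as part of &amp;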

I am using TextFieldParser to parse the file. Here is the sample data:

"A001";"RT:This is a tweet"; "http://www.whatever.com/test/module&amp;one"

For the example above, I get more columns/fields than I should:

Field[0] = "A001"
Field[1] = "RT:This is a tweet"
Field[2] = "http://www.whatever.com/test/module&amp"
Field[3] = "one"

Here is my code. What needs to change to handle this case?

using (var parser = new TextFieldParser(fileName))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(";");
    parser.TrimWhiteSpace = true;
    parser.HasFieldsEnclosedInQuotes = false;

    int rowIndex = 0;
    PropertyInfo[] properties = typeof(TwitterData).GetProperties();
    while (parser.PeekChars(1) != null)
    {
        // Strip surrounding spaces and double quotes from every field.
        var cleanFieldRowCells = parser.ReadFields().Select(
            f => f.Trim(new[] { ' ', '"' }));

        var twitter = new TwitterData();
        int index = 0;
        foreach (string c in cleanFieldRowCells)
        {
            string str = c;

            // Assign each field to the property at the same position.
            if (properties[index].PropertyType == typeof(DateTime))
            {
                string twitterDateTemplate = "ddd MMM dd HH:mm:ss +ffff yyyy";
                DateTime createdAt = DateTime.ParseExact(str, twitterDateTemplate, new System.Globalization.CultureInfo("en-AU"));
                properties[index].SetValue(twitter, createdAt);
            }
            else
            {
                properties[index].SetValue(twitter, str);
            }

            index++;
        }
    }
}

-Alan-

Best answer

Using your sample strings and setting the HasFieldsEnclosedInQuotes property to true works for me.

// Needs: using System; using System.IO; using System.Linq;
//        using Microsoft.VisualBasic.FileIO;
string LINES = @"
""A001"";""RT:This is a tweet""; ""http://www.whatever.com/test/module&amp;one""
""A001"";""RT: Test1 ; Test2"";""test.com"";
";
using (var sr = new StringReader(LINES))
{
    using (var parser = new TextFieldParser(sr))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(";");
        parser.TrimWhiteSpace = true;
        // Treat delimiters inside double quotes as part of the field.
        parser.HasFieldsEnclosedInQuotes = true;

        while (parser.PeekChars(1) != null)
        {
            var cleanFieldRowCells = parser.ReadFields().Select(
                f => f.Trim(new[] { ' ', '"' })).ToArray();
            Console.WriteLine("New Line");
            for (int i = 0; i < cleanFieldRowCells.Length; ++i)
            {
                Console.WriteLine(
                    "Field[{0}] = [{1}]", i, cleanFieldRowCells[i]
                );
            }
            Console.WriteLine("{0}", new string('=', 40));
        }
    }
}

Output:

New Line
Field[0] = [A001]
Field[1] = [RT:This is a tweet]
Field[2] = [http://www.whatever.com/test/module&amp;one]
========================================
New Line
Field[0] = [A001]
Field[1] = [RT: Test1 ; Test2]
Field[2] = [test.com]
Field[3] = []
========================================
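
To see why the property matters, here is a minimal sketch (not from the original post) that parses one row with the setting off and then on. The class name SemicolonCsvDemo and the helper ReadRows are made up for illustration. The sketch assumes a reference to Microsoft.VisualBasic is available, and its sample row is modeled on the question's data but leaves out the space before the third field.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // on .NET Framework, reference Microsoft.VisualBasic

public static class SemicolonCsvDemo
{
    // Hypothetical helper: parses semicolon-delimited text and returns
    // one string[] per row. The 'quoted' flag is passed straight to
    // HasFieldsEnclosedInQuotes so both behaviours can be compared.
    public static List<string[]> ReadRows(string text, bool quoted)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.TrimWhiteSpace = true;
            parser.HasFieldsEnclosedInQuotes = quoted;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        string row =
            "\"A001\";\"RT:This is a tweet\";\"http://www.whatever.com/test/module&amp;one\"";

        // Quotes ignored: the ; inside &amp; is treated as a delimiter.
        Console.WriteLine(ReadRows(row, quoted: false)[0].Length); // 4

        // Quotes honoured: the ; inside the quoted URL stays in the field.
        Console.WriteLine(ReadRows(row, quoted: true)[0].Length);  // 3
    }
}
```

With the setting on, the parser also removes the enclosing double quotes itself, so trimming '"' by hand afterwards should no longer be necessary.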

Regarding C# - parsing a semicolon-delimited file, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35389302/
