C#: reading a malformed XML file

Reposted. Author: 行者123. Updated: 2023-11-30 21:38:12

I have to read an XML file that has no root element in order to extract the data it contains. The XML has many elements like this:

<DocumentElement>
<LOG_x0020_ParityRate>
<DATE>12/09/2017 - 00:00</DATE>
<CHANNELNAME>ParityRate</CHANNELNAME>
<SQL>update THROOMDISP set ID_HOTEL = '104', ID_ROOM = '920', NUM = '3', MYDATA = '20171006' where id_hotel =104 and id_room ='920' and MYDATA ='20171006'</SQL>
<ID_HOTEL>104</ID_HOTEL>
<TYPEREQUEST>updateTHROOMDISP(OK)</TYPEREQUEST>
</LOG_x0020_ParityRate>
</DocumentElement><DocumentElement>
<LOG_x0020_ParityRate>
<DATE>12/09/2017 - 00:00</DATE>
<CHANNELNAME>ParityRate</CHANNELNAME>
<SQL>update THROOMDISP set ID_HOTEL = '105', ID_ROOM = '923', NUM = '1', MYDATA = '20171006' where id_hotel =105 and id_room ='923' and MYDATA ='20171006'</SQL>
<ID_HOTEL>105</ID_HOTEL>
<TYPEREQUEST>updateTHROOMDISP(OK)</TYPEREQUEST>
</LOG_x0020_ParityRate>
</DocumentElement><DocumentElement>
<LOG_x0020_ParityRate>
<DATE>12/09/2017 - 00:00</DATE>
<CHANNELNAME>ParityRate</CHANNELNAME>
<SQL>update THROOMDISP set ID_HOTEL = '104', ID_ROOM = '920', NUM = '3', MYDATA = '20171007' where id_hotel =104 and id_room ='920' and MYDATA ='20171007'</SQL>
<ID_HOTEL>104</ID_HOTEL>
<TYPEREQUEST>updateTHROOMDISP(OK)</TYPEREQUEST>
</LOG_x0020_ParityRate>
</DocumentElement><DocumentElement>

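I tried reading it as a string, manually adding an opening and a closing tag, and parsing the result as an XDocument, but the file also contains some malformed tags, such as these: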

</DocumentElement>
<TYPEREQUEST>updateTHROOMPRICE(OK)</TYPEREQUEST>

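These tags do not match any opening tag, so I get an exception when I call XDocument.Parse on the resulting string. The file has millions of lines, so I cannot read it line by line, otherwise the iteration would take hours. How can I get rid of all these malformed tags and parse the document?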

Best answer

Your XML is not well-formed, which often happens when XML data is merged together. It has multiple tags at the root level, so use an XmlReader like the one below:

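using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication4
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";

        static void Main(string[] args)
        {
            // Fragment conformance lets the reader accept several top-level elements.
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ConformanceLevel = ConformanceLevel.Fragment;

            using (XmlReader reader = XmlReader.Create(FILENAME, settings))
            {
                while (!reader.EOF)
                {
                    try
                    {
                        // Skip ahead to the next log record unless already positioned on one.
                        if (reader.Name != "LOG_x0020_ParityRate")
                        {
                            if (!reader.ReadToFollowing("LOG_x0020_ParityRate"))
                            {
                                break; // no more records
                            }
                        }

                        // Loads only the current element and leaves the reader just after it.
                        XElement parityRate = (XElement)XElement.ReadFrom(reader);

                        ParityRate newLog = new ParityRate();
                        // "HH" is the 24-hour clock; "hh" would reject hours above 12.
                        // Swap to "dd/MM/yyyy" if the log writes the day first.
                        newLog.date = DateTime.ParseExact((string)parityRate.Element("DATE"), "MM/dd/yyyy - HH:mm", CultureInfo.InvariantCulture);
                        newLog.name = (string)parityRate.Element("CHANNELNAME");
                        newLog.sql = (string)parityRate.Element("SQL");
                        newLog.hotel = (int)parityRate.Element("ID_HOTEL");
                        // Add the record only once every field has been read successfully.
                        ParityRate.logs.Add(newLog);
                    }
                    catch (XmlException ex)
                    {
                        // The reader cannot continue after a well-formedness error
                        // (for example an unmatched closing tag), so report it and stop
                        // rather than loop forever.
                        Console.WriteLine("XML error at line {0}: {1}", ex.LineNumber, ex.Message);
                        break;
                    }
                    catch (Exception ex)
                    {
                        // A missing or badly formatted field: skip this record.
                        // The reader is already past the element, so the loop can go on.
                        Console.WriteLine("Skipped a record: {0}", ex.Message);
                    }
                }
            }
        }
    }

    public class ParityRate
    {
        public static List<ParityRate> logs = new List<ParityRate>();

        public DateTime date { get; set; }
        public string name { get; set; }
        public string sql { get; set; }
        public int hotel { get; set; }
    }
}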
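
An XmlReader in fragment mode accepts multiple root-level elements, but it still throws an XmlException when it meets a closing tag that has no matching opening tag, like the stray tags shown in the question. The following is a minimal sketch of a different, more tolerant technique, not part of the original answer: it makes one sequential pass over the lines, collects each LOG_x0020_ParityRate block, and parses every block on its own so that a damaged block can be skipped. It assumes the opening and closing record tags each sit alone on a line, as in the sample; the names TolerantLogReader, ReadRecords and TryParse are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class TolerantLogReader
{
    const string OpenTag = "<LOG_x0020_ParityRate>";
    const string CloseTag = "</LOG_x0020_ParityRate>";

    // Yields one XElement per well-formed record.
    // Assumption: the record's opening and closing tags are each alone on a line.
    public static IEnumerable<XElement> ReadRecords(IEnumerable<string> lines)
    {
        StringBuilder block = null; // non-null only while inside a record
        foreach (string line in lines)
        {
            string trimmed = line.Trim();
            if (trimmed == OpenTag)
            {
                // A new opening tag discards any record that was left unfinished.
                block = new StringBuilder();
                block.AppendLine(trimmed);
            }
            else if (block != null)
            {
                block.AppendLine(trimmed);
                if (trimmed == CloseTag)
                {
                    XElement record = TryParse(block.ToString());
                    block = null;
                    if (record != null)
                    {
                        yield return record;
                    }
                }
            }
            // Lines outside a record (DocumentElement wrappers, stray tags) are ignored.
        }
    }

    // Returns null instead of throwing when a block is not well-formed XML.
    static XElement TryParse(string xml)
    {
        try
        {
            return XElement.Parse(xml);
        }
        catch (XmlException)
        {
            return null;
        }
    }
}
```

Passing File.ReadLines(path) as the argument streams the file lazily, so only one record is held in memory at a time. Each returned XElement can then be mapped to a ParityRate object in the same way as in the answer's code.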

For more on reading a malformed XML file in C#, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/46220779/
