
java - How to parse an XML file with HTML tags in a CDATA section?

Reposted. Author: 行者123. Updated: 2023-11-30 06:50:53

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<extendedinfo type="html">
<![CDATA[<table class="ResultTable" cellpadding=2 cellspacing=1 border=0><tr class="TableHeadingLine"><th bgcolor="#b3b3b3" align="left" colspan="6"><font face="arial, verdana, trebuchet, officina, sans-serif" size="+2"><B>Testcase: Init Testreport</B></font></th></tr><tr class="TableHeadingLine"><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="80px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="70px"></th></tr>]]>
</extendedinfo>
<extendedinfo type="html">
<![CDATA[<tr><td class="DefineCell">58.675124</td><td class="DefaultCell" colspan="5"><i><font color="#008000">Set_Temperature is set to 23</font></i><br>Set_Temperature = 23</td></tr>]]>
</extendedinfo>

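I have an .XML file, generated by a tool in the format shown above, in which the CDATA sections contain HTML data. Which parser, or what approach, can I use in Java to retrieve the HTML data from the XML file?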

Best answer

Simply access the CDATA as text content.
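To make that point concrete before the file-based variants, here is a minimal sketch that parses a small in-memory document and reads the CDATA back as ordinary text. The class name `CdataAsTextDemo`, the helper `rootText`, and the shortened sample string are hypothetical and only serve the illustration; the parser calls are the standard JAXP DOM API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CdataAsTextDemo {

    // Parses the given XML string and returns the text content of its root element.
    // Markup inside a CDATA section is not parsed, so it comes back verbatim.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample: the HTML is not well-formed XML
        // (unquoted attribute, unclosed <br>), which is fine inside CDATA.
        String xml = "<extendedinfo type=\"html\">"
                + "<![CDATA[<td colspan=5>Set_Temperature = 23<br></td>]]>"
                + "</extendedinfo>";
        System.out.println(rootText(xml));
    }
}
```

The HTML never has to be valid XML here, because the parser treats everything between `<![CDATA[` and `]]>` as character data.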

Variant 1 (DOM):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public void getCDATAFromHardcodedPathWithDom() {
    String yourSampleFile = "/path/toYour/sample/file.xml";
    String cdataNode = "extendedinfo";
    try (InputStream in =
            new BufferedInputStream(new FileInputStream(yourSampleFile))) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(in);
        // All <extendedinfo> elements, in document order.
        NodeList elements = doc.getElementsByTagName(cdataNode);
        for (int i = 0; i < elements.getLength(); i++) {
            Node e = elements.item(i);
            // The CDATA content is part of the element's text content.
            System.out.println(e.getTextContent());
        }
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Variant 2 (StAX):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public void getCDATAFromHardcodedPathWithStax() {
    String yourSampleFile = "/path/toYour/sample/file.xml";
    String cdataNode = "extendedinfo";
    XMLStreamReader r = null;
    try (InputStream in =
            new BufferedInputStream(new FileInputStream(yourSampleFile))) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        r = factory.createXMLStreamReader(in);
        while (r.hasNext()) {
            switch (r.getEventType()) {
                case XMLStreamConstants.START_ELEMENT:
                    if (cdataNode.equals(r.getName().getLocalPart())) {
                        // Reads up to the matching end tag and returns the text,
                        // including the CDATA content.
                        System.out.println(r.getElementText());
                    }
                    break;
                default:
                    break;
            }
            r.next();
        }
    } catch (Exception e) {
        throw new RuntimeException(e);
    } finally {
        // XMLStreamReader is not AutoCloseable, so close it explicitly.
        if (r != null) {
            try {
                r.close();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}
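The sample elements all carry `type="html"`. If a file also contained `extendedinfo` elements of other types (an assumption, since the question shows only `html`), the StAX loop could check the attribute before reading the text. The following is a sketch under that assumption; the class name `HtmlInfoReader` and the method `htmlFragments` are hypothetical, and the input is taken as a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class HtmlInfoReader {

    // Collects the trimmed text of every <extendedinfo type="html"> element.
    static List<String> htmlFragments(String xml) {
        List<String> fragments = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    // Advance first, then inspect the event that was just read.
                    if (r.next() == XMLStreamConstants.START_ELEMENT
                            && "extendedinfo".equals(r.getLocalName())
                            && "html".equals(r.getAttributeValue(null, "type"))) {
                        // getElementText() leaves the reader on the end tag.
                        fragments.add(r.getElementText().trim());
                    }
                }
            } finally {
                r.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return fragments;
    }
}
```

Calling `htmlFragments` on a document that mixes `type="html"` and other types returns only the HTML fragments, without the line breaks that surround each CDATA section.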

With /path/toYour/sample/file.xml containing

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<root>
<extendedinfo type="html">
<![CDATA[<table class="ResultTable" cellpadding=2 cellspacing=1 border=0><tr class="TableHeadingLine"><th bgcolor="#b3b3b3" align="left" colspan="6"><font face="arial, verdana, trebuchet, officina, sans-serif" size="+2"><B>Testcase: Init Testreport</B></font></th></tr><tr class="TableHeadingLine"><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="80px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="70px"></th></tr>]]>
</extendedinfo>
<extendedinfo type="html">
<![CDATA[<tr><td class="DefineCell">58.675124</td><td class="DefaultCell" colspan="5"><i><font color="#008000">Set_Temperature is set to 23</font></i><br>Set_Temperature = 23</td></tr>]]>
</extendedinfo>
</root>

it will give you

<table class="ResultTable" cellpadding=2 cellspacing=1 border=0><tr class="TableHeadingLine"><th bgcolor="#b3b3b3" align="left" colspan="6"><font face="arial, verdana, trebuchet, officina, sans-serif" size="+2"><B>Testcase: Init Testreport</B></font></th></tr><tr class="TableHeadingLine"><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="120px"></th><th class="TableHeadingCell" width="80px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="345px"></th><th class="TableHeadingCell" width="70px"></th></tr>


<tr><td class="DefineCell">58.675124</td><td class="DefaultCell" colspan="5"><i><font color="#008000">Set_Temperature is set to 23</font></i><br>Set_Temperature = 23</td></tr>
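The blank lines in that output appear to come from the line breaks that surround each CDATA section in the file, since the element's text content includes them. If only the CDATA content itself is wanted, one option is to pick out the CDATA section nodes among the element's children. This is a minimal sketch, not part of the original answer: the class `CdataSectionsOnly` and the method `cdataOf` are hypothetical names, and it relies on the DOM parser not coalescing CDATA into adjacent text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CdataSectionsOnly {

    // Returns the content of every CDATA section that is a direct child
    // of an element named tagName, in document order.
    static List<String> cdataOf(String xml, String tagName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Keep CDATA sections as separate nodes instead of merging them
            // into the neighbouring text nodes.
            factory.setCoalescing(false);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            List<String> result = new ArrayList<>();
            NodeList elements = doc.getElementsByTagName(tagName);
            for (int i = 0; i < elements.getLength(); i++) {
                NodeList children = elements.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() == Node.CDATA_SECTION_NODE) {
                        result.add(child.getNodeValue());
                    }
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

When the surrounding whitespace is the only concern, calling `trim()` on the result of `getTextContent()` is a simpler alternative.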

An interesting alternative using JAXB is given here:

Retrieve value from CDATA

An example of how to extract all CDATA is given here:

Unable to check CDATA in XML using XMLEventReader in Stax

For more on java - How to parse an XML file with HTML tags in a CDATA section?, see the similar question on Stack Overflow: https://stackoverflow.com/questions/42802202/
