
Java 8 - Splitting a huge XML file using StAX gives unexpected results

Reposted. Author: 行者123. Updated: 2023-12-02 06:15:46

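While looking for a way to split a huge XML file, I came across a very nice solution that uses StAX and Transformer.transform(). It works well, but I noticed that some tags go missing. Why is that?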

The XML file with the name ... gives the following results. On every even-numbered iteration, the enclosing <car> element tag is omitted.

Element: <?xml version="1.0" encoding="UTF-8"?><car><name>car1</name></car>
Element: <?xml version="1.0" encoding="UTF-8"?><name>car2</name>
Element: <?xml version="1.0" encoding="UTF-8"?><car><name>car3</name></car>
Element: <?xml version="1.0" encoding="UTF-8"?><name>car4</name>

How do I get the correct elements? Does this have something to do with transform(s, r) interfering with the reading of the input stream?

Here is my code (a pattern I have seen in many places). Using a StringReader or a FileReader makes no difference.

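I expected this: loop { advance to the start tag; visit that element }. What I see instead: first a whole element, then only part of an element, and then the pattern repeats.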

String testCars = "<root><car><name>car1</name></car><car><name>car2</name></car><car><name>car3</name></car><car><name>car4</name></car></root>";
String element = "car";
try {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader streamReader = factory.createXMLStreamReader(new StringReader(testCars));
    streamReader.nextTag(); // advance to <root>
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    // Intended: advance to the next child start tag, then serialize that element
    while (streamReader.nextTag() == XMLStreamConstants.START_ELEMENT) {
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        t.transform(new StAXSource(streamReader), result);
        System.out.println("Element: " + writer.toString());
    }
} catch (Exception e) { ... }
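One way to test the suspicion that transform() moves the reader is to look at where the reader sits right after a single call. The following is only a diagnostic sketch: the class name ReaderPositionProbe and its helper methods are made up for illustration, and the position it reports may depend on the StAX and transformer implementations in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper class, not part of the original question.
class ReaderPositionProbe {

    /** Describes the event the reader is currently positioned on. */
    static String describe(XMLStreamReader reader) {
        if (reader.isStartElement()) {
            return "START_ELEMENT " + reader.getLocalName();
        }
        if (reader.isEndElement()) {
            return "END_ELEMENT " + reader.getLocalName();
        }
        return "event type " + reader.getEventType();
    }

    /** Transforms the first child of the root, then reports where the reader ended up. */
    static String positionAfterFirstTransform(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // root start tag
            reader.nextTag(); // first child start tag
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Serialize the first child and discard the text; only the side effect matters here.
            transformer.transform(new StAXSource(reader), new StreamResult(new StringWriter()));
            return describe(reader);
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException("Probe failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><car><name>car1</name></car><car><name>car2</name></car></root>";
        System.out.println("After transform: " + positionAfterFirstTransform(xml));
    }
}
```

If the probe reports that the reader is already on the start tag of the second <car>, then a further nextTag() call would move on to the nested <name>, which matches the truncated output shown above.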

Best answer

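Thanks to Andreas, here is the solution. The output in the question is consistent with transform() leaving the reader positioned just past the end tag of the element it consumed, so an unconditional nextTag() then skips ahead to the nested <name> start tag. The loop therefore has to check the current event before advancing: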

String testCars = "<root><car><name>car1</name></car><other><something>Unknown</something></other><car><name>car2</name></car></root>";
XMLInputFactory factory = XMLInputFactory.newInstance();
try {
    XMLStreamReader streamReader = factory.createXMLStreamReader(new StringReader(testCars));
    streamReader.nextTag(); // advance to <root>
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    streamReader.nextTag(); // advance to the first child of <root>
    // Check the current event first: after transform() the reader may already be on the next start tag.
    // The second clause is only evaluated when the reader is not on a start tag.
    while (streamReader.isStartElement() ||
            (!streamReader.hasNext() && streamReader.nextTag() == XMLStreamConstants.START_ELEMENT)) {
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        t.transform(new StAXSource(streamReader), result);
        System.out.println("XmlElement: " + writer.toString());
    }
} catch (Exception e) { ... }

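The input (indented here for readability; the string in the code has no whitespace between elements) is: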

<root>
    <car>
        <name>car1</name>
    </car>
    <other>
        <something>Unknown</something>
    </other>
    <car>
        <name>car2</name>
    </car>
</root>

The output is:

XmlElement: <?xml version="1.0" encoding="UTF-8"?><car><name>car1</name></car>
XmlElement: <?xml version="1.0" encoding="UTF-8"?><other><something>Unknown</something></other>
XmlElement: <?xml version="1.0" encoding="UTF-8"?><car><name>car2</name></car>
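The test string in the answer has no whitespace between elements, whereas a real file is often indented, in which case the reader may land on a whitespace text event after each transform(). The following is a minimal sketch of a variant that skips such events explicitly. The class StaxSplitSketch and the method splitChildren are hypothetical names, and the sketch assumes, as the output above suggests, that transform() leaves the reader on the event following the consumed element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper class, not part of the original answer.
class StaxSplitSketch {

    /** Returns each direct child of the root element, serialized as its own XML string. */
    static List<String> splitChildren(String xml) {
        List<String> parts = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();

            reader.nextTag(); // move onto the root start tag
            reader.next();    // step inside the root element
            while (true) {
                // Skip whitespace, comments and similar events until a tag is reached.
                while (!reader.isStartElement() && !reader.isEndElement() && reader.hasNext()) {
                    reader.next();
                }
                if (!reader.isStartElement()) {
                    break; // reached the root end tag (or the end of the document)
                }
                StringWriter writer = new StringWriter();
                transformer.transform(new StAXSource(reader), new StreamResult(writer));
                parts.add(writer.toString());
                // No nextTag() here: the reader is assumed to have moved past this element already.
            }
            reader.close();
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException("Could not split XML", e);
        }
        return parts;
    }

    public static void main(String[] args) {
        String xml = "<root>\n  <car><name>car1</name></car>\n"
                + "  <other><something>Unknown</something></other>\n"
                + "  <car><name>car2</name></car>\n</root>";
        for (String part : splitChildren(xml)) {
            System.out.println("XmlElement: " + part);
        }
    }
}
```

For a genuinely huge file, the same loop could read from a FileReader and write each piece to its own output instead of collecting strings in memory; the list is used here only to keep the sketch short.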

Regarding "Java 8 - Splitting a huge XML file using StAX gives unexpected results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55872732/
