gpt4 book ai didi

java - 如何在JAVA中读取一个大的XML文件并根据标签将其分割成小的XML文件?

转载 作者:行者123 更新时间:2023-12-01 06:44:17 26 4
gpt4 key购买 nike

我是JAVA编程新手,现在我需要JAVA程序来读取包含..标签的大XML文件。输入示例如下。

输入.xml

<row>
<Name>Filename1</Name>
</row>
<row>
<Name>Filename2</Name>
</row>
<row>
<Name>Filename3</Name>
</row>
<row>
<Name>Filename4</Name>
</row>
<row>
<Name>Filename5</Name>
</row>
<row>
<Name>Filename6</Name>
</row>
.
.

我需要首先输出 <row> </row>作为单个 .xml 文件,文件名为 filename1.xml第二个 <row>..</row>如 filename2.xml 等。

谁能告诉我如何用Java以简单的方式做到这一点的步骤,如果你提供任何示例代码,这将非常有用?

最佳答案

您可以使用 StAX 执行以下操作,因为您说您的 xml 很大

适合您的用例的代码

以下代码使用 StAX API 来分解您问题中概述的文档:

 import java.io.*;
import java.util.*;

import javax.xml.namespace.QName;
import javax.xml.stream.*;
import javax.xml.stream.events.*;

public class Demo {

public static void main(String[] args) throws Exception {
Demo demo = new Demo();
demo.split("src/forum7408938/input.xml", "nickname");
//demo.split("src/forum7408938/input.xml", null);
}

private void split(String xmlResource, String condition) throws Exception {
XMLEventFactory xef = XMLEventFactory.newFactory();
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLEventReader xer = xif.createXMLEventReader(new FileReader(xmlResource));
StartElement rootStartElement = xer.nextTag().asStartElement(); // Advance to statements element
StartDocument startDocument = xef.createStartDocument();
EndDocument endDocument = xef.createEndDocument();

XMLOutputFactory xof = XMLOutputFactory.newFactory();
while(xer.hasNext() && !xer.peek().isEndDocument()) {
boolean metCondition;
XMLEvent xmlEvent = xer.nextTag();
if(!xmlEvent.isStartElement()) {
break;
}
// Be able to split XML file into n parts with x split elements(from
// the dummy XML example staff is the split element).
StartElement breakStartElement = xmlEvent.asStartElement();
List<XMLEvent> cachedXMLEvents = new ArrayList<XMLEvent>();

// BOUNTY CRITERIA
// I'd like to be able to specify condition that must be in the
// split element i.e. I want only staff which have nickname, I want
// to discard those without nicknames. But be able to also split
// without conditions while running split without conditions.
if(null == condition) {
cachedXMLEvents.add(breakStartElement);
metCondition = true;
} else {
cachedXMLEvents.add(breakStartElement);
xmlEvent = xer.nextEvent();
metCondition = false;
while(!(xmlEvent.isEndElement() && xmlEvent.asEndElement().getName().equals(breakStartElement.getName()))) {
cachedXMLEvents.add(xmlEvent);
if(xmlEvent.isStartElement() && xmlEvent.asStartElement().getName().getLocalPart().equals(condition)) {
metCondition = true;
break;
}
xmlEvent = xer.nextEvent();
}
}

if(metCondition) {
// Create a file for the fragment, the name is derived from the value of the id attribute
FileWriter fileWriter = null;
fileWriter = new FileWriter("src/forum7408938/" + breakStartElement.getAttributeByName(new QName("id")).getValue() + ".xml");

// A StAX XMLEventWriter will be used to write the XML fragment
XMLEventWriter xew = xof.createXMLEventWriter(fileWriter);
xew.add(startDocument);

// BOUNTY CRITERIA
// The content of the spitted files should be wrapped in the
// root element from the original file(like in the dummy example
// company)
xew.add(rootStartElement);

// Write the XMLEvents that were cached while when we were
// checking the fragment to see if it matched our criteria.
for(XMLEvent cachedEvent : cachedXMLEvents) {
xew.add(cachedEvent);
}

// Write the XMLEvents that we still need to parse from this
// fragment
xmlEvent = xer.nextEvent();
while(xer.hasNext() && !(xmlEvent.isEndElement() && xmlEvent.asEndElement().getName().equals(breakStartElement.getName()))) {
xew.add(xmlEvent);
xmlEvent = xer.nextEvent();
}
xew.add(xmlEvent);

// Close everything we opened
xew.add(xef.createEndElement(rootStartElement.getName(), null));
xew.add(endDocument);
fileWriter.close();
}
}
}

}

关于java - 如何在JAVA中读取一个大的XML文件并根据标签将其分割成小的XML文件?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20143405/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com