
java - Splitting elements of an XML file into separate files

Reposted. Author: 行者123. Updated: 2023-12-02 03:25:25

I have an XML file that contains many nested topic elements. For example:

<?xml version="1.0" encoding="UTF-8"?>
<topic id="topic-1">
<title>ADBT</title>

<para>The program executes a database request by using the ADBT
library. The ADBT library prepares
the request and calls an ODBC driver
or a native API.  
</para>

<topic id="topic_wom_eqy_ev">
<title>Establishing a connection</title>
<para>
In order to use a database with ADBT, the first step to be taken
is
to establish a
connection.
</para>

</topic>
<topic id="topic_dsw_gqy_ev">
<title>Querying a database</title>
<para>Querying a database involves a number of stages.</para>
<topic id="topic_ljf_isy_ev">
<title>Stage one: create a query</title>
<para> A new query (ADBT_Select object) can only be created starting
from a previously
established connection. A query is created using
the CreateSelect method in two
different
ways:
</para>
</topic>
</topic>

</topic>

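I want to split each topic into a new XML file whose file name is the topic's title. If a topic contains another topic, the child topic becomes its own file, and the parent topic becomes a separate file whose content excludes the child topics. In this example the output would be four files with the following content:

File 1: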

<topic id="topic-1">
<title>ADBT</title>

<para>The program executes a database request by using the ADBT
library. The ADBT library prepares
the request and calls an ODBC driver or a native API.  
</para>
</topic>

File 2:

<topic id="topic_wom_eqy_ev">
<title>Establishing a connection</title>
<para>
In order to use a database with ADBT, the first step to be taken is
to establish a
connection.
</para>

</topic>

File 3:

<topic id="topic_dsw_gqy_ev">
<title>Querying a database</title>
<para>Querying a database involves a number of stages.</para>
</topic>

File 4:

<topic id="topic_ljf_isy_ev">
<title>Stage one: create a query</title>
<para> A new query (ADBT_Select object) can only be created starting
from a previously
established connection. A query is created using the CreateSelect method in two
different
ways:
</para>
</topic>

I have written some functions, but I cannot figure out how to separate topics that are nested several levels deep.

Best answer

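Basically, what you want to do is:

  • Read the XML with an XML reader of your choice
  • Collect all <topic> elements by traversing the document recursively
  • For each <topic> element, create a copy of that element (probably a new Document per element, with the <topic> element as its root) and copy over all children of the original element except those with tagName = topic. This guarantees that the recursive traversal does not produce overlapping content
  • For each Document created this way, serialize it to a file with an XML writer of your choice

So, as schematic code: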

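// Schematic only: the helper methods below are placeholders, not library calls.
Document document = readXMLDocument(...);
// Every <topic> element at any nesting depth, parents and children alike.
List<Element> topicElements = readTopicElementsRecursively(document);
List<Document> splitTopicDocuments = new ArrayList<>();
for (Element el : topicElements) {
    // One new Document per topic, without its nested <topic> children.
    Document doc = copyElementWithoutTopicChildren(el);
    splitTopicDocuments.add(doc);
}
writeTopicDocuments(splitTopicDocuments);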
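
The schematic above leaves its helpers undefined. One possible way to fill them in, using only the JDK's built-in DOM and Transformer APIs, is sketched below. The class name `TopicSplitter` and the methods `parse`, `fileNameFor`, and `split` are hypothetical names chosen for this illustration. The sketch assumes that nested topics are always direct children of their parent `<topic>`, and that characters which are invalid in file names on common file systems (such as the colon in "Stage one: create a query") may be replaced with an underscore; neither assumption comes from the original question.

```java
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TopicSplitter {

    /** Parses an XML string into a DOM document (hypothetical convenience helper). */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Collects every <topic> element below node, in document order, at any depth. */
    static List<Element> readTopicElementsRecursively(Node node, List<Element> out) {
        if (node instanceof Element && "topic".equals(((Element) node).getTagName())) {
            out.add((Element) node);
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            readTopicElementsRecursively(c, out);
        }
        return out;
    }

    /** Returns a new document whose root is a copy of el without its direct <topic> children. */
    static Document copyElementWithoutTopicChildren(Element el, DocumentBuilder builder) {
        Document doc = builder.newDocument();
        Element root = (Element) doc.importNode(el, true); // deep copy into the new document
        doc.appendChild(root);
        Node child = root.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before removing
            if (child instanceof Element && "topic".equals(((Element) child).getTagName())) {
                root.removeChild(child);
            }
            child = next;
        }
        return doc;
    }

    /** Derives a file name from the topic's <title>; falls back to the id attribute. */
    static String fileNameFor(Element topic) {
        for (Node c = topic.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && "title".equals(((Element) c).getTagName())) {
                // Assumption: replace characters that are invalid in common file systems.
                return c.getTextContent().trim().replaceAll("[\\\\/:*?\"<>|]", "_") + ".xml";
            }
        }
        return topic.getAttribute("id") + ".xml";
    }

    /** Writes one file per <topic> into outDir and returns the paths in document order. */
    static List<Path> split(Document source, Path outDir) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Transformer writer = TransformerFactory.newInstance().newTransformer();
        List<Path> written = new ArrayList<>();
        for (Element topic : readTopicElementsRecursively(source, new ArrayList<>())) {
            Document doc = copyElementWithoutTopicChildren(topic, builder);
            Path target = outDir.resolve(fileNameFor(topic));
            try (OutputStream os = Files.newOutputStream(target)) {
                writer.transform(new DOMSource(doc), new StreamResult(os));
            }
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: TopicSplitter <input.xml> <output-dir>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), "UTF-8");
        Path outDir = Files.createDirectories(Paths.get(args[1]));
        for (Path p : split(parse(xml), outDir)) {
            System.out.println("wrote " + p);
        }
    }
}
```

Because the topics are collected from the original document before any copy is modified, each nested topic is still visited and written on its own, while its parent's copy has it removed. Two topics with the same title would overwrite each other in this sketch; if that can happen, adding the `id` attribute to the file name is one way around it.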

Regarding "java - Splitting elements of an XML file into separate files", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38993496/
