java - Concatenating two org.w3c.dom.Document objects

Reposted by: 行者123. Last updated: 2023-12-01 22:37:31

I want to concatenate two org.w3c.dom.Document objects. I have something like this:

Document finalDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Document document1 = createDocumentOne();
Document document2 = createDocumentTwo();

// This didn't work
finalDocument.appendChild(document1);
finalDocument.appendChild(document2);

document1 and document2 both have this format:

<headerTag>
    <tag1>value</tag1>  
</headerTag>

What I want in the end is a document like this:

<headerTag>
    <tag1>valueForDocument1</tag1>  
</headerTag>
<headerTag>
    <tag1>valueForDocument2</tag1>  
</headerTag>

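I don't think this can be done directly, because the two headerTag elements would need a common parent. If so, I would like to create that "fake" parent, concatenate the files, and then recover only the list of headerTag elements.

How can I do this?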

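Best answer

You are on the right track: create a new document, parse each part, and add its nodes to the new document.

Your approach failed because you tried to append an entire document to another document, which is not possible. A Document node cannot be the child of another node, and every node belongs to the document that created it, so nodes have to be imported into the target document before they can be appended.

You could try something like this: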
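To make the failure concrete, here is a minimal sketch that contrasts the two calls. The class name AppendVsImportDemo and its helper methods are made up for illustration and are not part of the original answer. It only assumes the standard JAXP/DOM API: appendChild rejects a foreign Document node with a DOMException, while importNode produces a copy owned by the target document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class; names are illustrative only.
class AppendVsImportDemo {

    /** Parses an XML string into a DOM document. */
    static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Returns true if appending a whole Document to another Document is rejected. */
    static boolean appendWholeDocumentFails(Document target, Document source) {
        try {
            target.appendChild(source);
            return false;
        } catch (DOMException e) {
            return true;
        }
    }

    /** Copies the source's root element into target and appends the copy under parent. */
    static Node importRoot(Document target, Node parent, Document source) {
        Node copy = target.importNode(source.getDocumentElement(), true);
        return parent.appendChild(copy);
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document target = builder.newDocument();
        Document source = parse(builder, "<headerTag><tag1>value</tag1></headerTag>");

        System.out.println("whole-document append rejected: "
                + appendWholeDocumentFails(target, source));

        Node copy = importRoot(target, target, source);
        System.out.println("imported copy owned by target: "
                + (copy.getOwnerDocument() == target));
    }
}
```

Note that an empty document accepts only one root element, which is why a second headerTag needs a wrapper parent.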

public org.w3c.dom.Document concatXmlDocuments(String rootElementName, InputStream... xmlInputStreams) throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    org.w3c.dom.Document result = builder.newDocument();
    org.w3c.dom.Element rootElement = result.createElement(rootElementName);
    result.appendChild(rootElement);
    for(InputStream is : xmlInputStreams) {
        org.w3c.dom.Document document = builder.parse(is);
        org.w3c.dom.Element root = document.getDocumentElement();
        NodeList childNodes = root.getChildNodes();
        for(int i = 0; i < childNodes.getLength(); i++) {
            Node importNode = result.importNode(childNodes.item(i), true);
            rootElement.appendChild(importNode);
        }
    }
    return result;
}

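The code above copies every node found under the root element of each document. Of course, you can choose to copy only the nodes you are interested in. The resulting document will contain all the nodes from both documents.

Test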
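As one possible way to copy selectively, the sketch below keeps only element children and skips the whitespace text nodes that indented input produces. The class SelectiveConcatDemo and the method concatElementsOnly are hypothetical names, and the method takes XML strings rather than input streams purely to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical variant of concatXmlDocuments that copies element nodes only.
class SelectiveConcatDemo {

    static Document concatElementsOnly(String rootElementName, String... xmlParts) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document result = builder.newDocument();
        Element rootElement = result.createElement(rootElementName);
        result.appendChild(rootElement);

        for (String xml : xmlParts) {
            Document part = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = part.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip text, comment and other non-element nodes.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    rootElement.appendChild(result.importNode(child, true));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document result = concatElementsOnly("headerTag",
                "<headerTag>\r\n    <tag1>doc1 value</tag1>\r\n</headerTag>",
                "<headerTag>\r\n    <tag1>doc2 value</tag1>\r\n</headerTag>");
        System.out.println("children copied: "
                + result.getDocumentElement().getChildNodes().getLength());
    }
}
```

The same check could be extended to compare node names if only particular tags should be carried over.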

@Test
public void concatXmlDocuments() throws ParserConfigurationException, SAXException, IOException, TransformerException {
    try (
            InputStream doc1 = new ByteArrayInputStream((
                "<headerTag>\r\n" + 
                "    <tag1>doc1 value</tag1>\r\n" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            InputStream doc2 = new ByteArrayInputStream((
                "<headerTag>\r\n" + 
                "    <tag1>doc2 value</tag1>\r\n" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream docR = new ByteArrayOutputStream();

        ) {

        org.w3c.dom.Document result = concatXmlDocuments("headerTag", doc1, doc2);
        TransformerFactory trf = TransformerFactory.newInstance();
        Transformer tr = trf.newTransformer();
        tr.setOutputProperty(OutputKeys.INDENT, "yes");
        DOMSource source = new DOMSource(result);
        StreamResult sr = new StreamResult(docR);
        tr.transform(source, sr);
        System.out.print(new String(docR.toByteArray(), StandardCharsets.UTF_8));
    }
}

Output

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<headerTag>
    <tag1>doc1 value</tag1>
    <tag1>doc2 value</tag1>
</headerTag>

Edit

I would like to create that "fake" parent, concatenate the files, but then only recover the List of elements headerTag

As you said, create a fake parent. Here is how you can do it:

1) Do the concatenation

public org.w3c.dom.Document concatXmlDocuments(InputStream... xmlInputStreams) throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    org.w3c.dom.Document result = builder.newDocument();
    org.w3c.dom.Element rootElement = result.createElement("fake");
    result.appendChild(rootElement);
    for(InputStream is : xmlInputStreams) {
        org.w3c.dom.Document document = builder.parse(is);
        org.w3c.dom.Element subRoot = document.getDocumentElement();
        Node importNode = result.importNode(subRoot, true);
        rootElement.appendChild(importNode);
    }
    return result;
}

2) Recover the node list of headerTag elements

public NodeList recoverTheListOfElementsHeaderTag(String xml) throws ParserConfigurationException, SAXException, IOException {
    NodeList listOfElementsHeaderTag = null;
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    try (InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(builder.parse(is));
    }
    return listOfElementsHeaderTag;
}

public NodeList recoverTheListOfElementsHeaderTag(org.w3c.dom.Document doc) {
    org.w3c.dom.Element root = doc.getDocumentElement();
    return root.getChildNodes();
}
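root.getChildNodes() returns every child of the fake root, so if the input is indented the list will also contain whitespace text nodes. A possible alternative, sketched here with a hypothetical helper named childElementsNamed, collects only the direct child elements with a given tag name into a java.util.List:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original answer.
class HeaderTagListDemo {

    /** Returns the direct child elements of parent whose tag name equals tagName. */
    static List<Element> childElementsNamed(Element parent, String tagName) {
        List<Element> matches = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                matches.add((Element) n);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<fake>\n"
                + "  <headerTag><tag1>doc1 value</tag1></headerTag>\n"
                + "  <headerTag><tag1>doc2 value</tag1></headerTag>\n"
                + "</fake>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        for (Element e : childElementsNamed(doc.getDocumentElement(), "headerTag")) {
            System.out.println(e.getNodeName() + " -> " + e.getTextContent());
        }
    }
}
```

Unlike Element.getElementsByTagName, which searches all descendants, this walks only the direct children, so nested elements with the same name are not picked up.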

Test

@Test
public void concatXmlDocuments() throws ParserConfigurationException, SAXException, IOException, TransformerException {
    try (
            InputStream doc1 = new ByteArrayInputStream((
                "<headerTag>" + 
                "<tag1>doc1 value</tag1>" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            InputStream doc2 = new ByteArrayInputStream((
                "<headerTag>" + 
                "<tag1>doc2 value</tag1>" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));

        ) {

        org.w3c.dom.Document result = concatXmlDocuments(doc1, doc2);
        String resultXML = toXML(result);
        System.out.printf("%s%n", resultXML);
        NodeList listOfElementsHeaderTag = null;
        System.out.printf("===================================================%n");
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(resultXML);
        printNodeList(listOfElementsHeaderTag);
        System.out.printf("===================================================%n");
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(result);
        printNodeList(listOfElementsHeaderTag);
    }
}


private String toXML(org.w3c.dom.Document result) throws TransformerFactoryConfigurationError, TransformerConfigurationException, TransformerException, IOException {
    String resultXML = null;
    try (ByteArrayOutputStream docR = new ByteArrayOutputStream()) {
        TransformerFactory trf = TransformerFactory.newInstance();
        Transformer tr = trf.newTransformer();
        DOMSource source = new DOMSource(result);
        StreamResult sr = new StreamResult(docR);
        tr.transform(source, sr);
        resultXML = new String(docR.toByteArray(), StandardCharsets.UTF_8);
    }
    return resultXML;
}

private void printNodeList(NodeList nodeList) {
    for(int i = 0; i < nodeList.getLength(); i++) {
        printNode(nodeList.item(i), "");
    }
}

private void printNode(Node node, String startIndent) {
    if(node != null) {
        System.out.printf("%s%s%n", startIndent, node.toString());
        NodeList childNodes = node.getChildNodes();
        for(int i = 0; i < childNodes.getLength(); i++) {
            printNode(childNodes.item(i), startIndent+ "    ");
        }
    }
}

Output

<?xml version="1.0" encoding="UTF-8" standalone="no"?><fake><headerTag><tag1>doc1 value</tag1></headerTag><headerTag><tag1>doc2 value</tag1></headerTag></fake>
===================================================
[headerTag: null]
    [tag1: null]
        [#text: doc1 value]
[headerTag: null]
    [tag1: null]
        [#text: doc2 value]
===================================================
[headerTag: null]
    [tag1: null]
        [#text: doc1 value]
[headerTag: null]
    [tag1: null]
        [#text: doc2 value]
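If the goal is text that looks like the output requested in the question (the headerTag elements one after another, without the fake wrapper), one option is to serialize each child element separately and omit the XML declaration. The following is only a sketch under that assumption; FragmentSerializerDemo and serializeChildren are hypothetical names. Keep in mind that the resulting string has several top-level elements, so it is not a well-formed XML document on its own.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original answer.
class FragmentSerializerDemo {

    /** Serializes each child element of parent on its own line, without the parent itself. */
    static String serializeChildren(Element parent) throws TransformerException {
        Transformer tr = TransformerFactory.newInstance().newTransformer();
        // Suppress the <?xml ...?> header that would otherwise precede every fragment.
        tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                tr.transform(new DOMSource(n), new StreamResult(out));
                out.write(System.lineSeparator());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<fake><headerTag><tag1>doc1 value</tag1></headerTag>"
                + "<headerTag><tag1>doc2 value</tag1></headerTag></fake>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        System.out.print(serializeChildren(doc.getDocumentElement()));
    }
}
```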

This post on concatenating two org.w3c.dom.Document objects in Java is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/26681471/
