
python - Faithfully preserving comments in parsed XML

Reposted. Author: 太空狗. Updated: 2023-10-29 17:26:55

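I want to preserve comments as faithfully as possible while manipulating XML.

I managed to preserve the comments, but their content is being XML-escaped.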

#!/usr/bin/env python
# add_host_to_tomcat.py

import xml.etree.ElementTree as ET
from CommentedTreeBuilder import CommentedTreeBuilder
parser = CommentedTreeBuilder()

if __name__ == '__main__':
    filename = "/opt/lucee/tomcat/conf/server.xml"

    # this is the important part: use the comment-preserving parser
    tree = ET.parse(filename, parser)

    # get the node to add a child to
    engine_node = tree.find("./Service/Engine")

    # add a node: Engine.Host
    host_node = ET.SubElement(
        engine_node,
        "Host",
        name="local.mysite.com",
        appBase="webapps"
    )
    # add a child to new node: Engine.Host.Context
    ET.SubElement(
        host_node,
        'Context',
        path="",
        docBase="/path/to/doc/base"
    )

    tree.write('out.xml')

#!/usr/bin/env python
# CommentedTreeBuilder.py

from xml.etree import ElementTree

class CommentedTreeBuilder ( ElementTree.XMLTreeBuilder ):
    def __init__ ( self, html = 0, target = None ):
        ElementTree.XMLTreeBuilder.__init__( self, html, target )
        self._parser.CommentHandler = self.handle_comment

    def handle_comment ( self, data ):
        self._target.start( ElementTree.Comment, {} )
        self._target.data( data )
        self._target.end( ElementTree.Comment )

However, a comment like this:

  <!--
EXAMPLE HOST ENTRY:
<Host name="lucee.org" appBase="webapps">
<Context path="" docBase="/var/sites/getrailo.org" />
<Alias>www.lucee.org</Alias>
<Alias>my.lucee.org</Alias>
</Host>

HOST ENTRY TEMPLATE:
<Host name="[ENTER DOMAIN NAME]" appBase="webapps">
<Context path="" docBase="[ENTER SYSTEM PATH]" />
<Alias>[ENTER DOMAIN ALIAS]</Alias>
</Host>
-->

ends up as:

  <!--
EXAMPLE HOST ENTRY:
&lt;Host name="lucee.org" appBase="webapps"&gt;
&lt;Context path="" docBase="/var/sites/getrailo.org" /&gt;
&lt;Alias&gt;www.lucee.org&lt;/Alias&gt;
&lt;Alias&gt;my.lucee.org&lt;/Alias&gt;
&lt;/Host&gt;

HOST ENTRY TEMPLATE:
&lt;Host name="[ENTER DOMAIN NAME]" appBase="webapps"&gt;
&lt;Context path="" docBase="[ENTER SYSTEM PATH]" /&gt;
&lt;Alias&gt;[ENTER DOMAIN ALIAS]&lt;/Alias&gt;
&lt;/Host&gt;
-->

I also tried self._target.data( saxutils.unescape(data) ) in CommentedTreeBuilder.py, but it did not seem to do anything. In fact, I think the problem occurs somewhere after the handle_comment() step.
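A small check, offered only as a sketch, is consistent with that suspicion. It assumes the comment handler receives the comment text raw (with no entities in it), which is why unescape() would have nothing to change. The `raw` string below is an illustrative stand-in, not data captured from the real parser.

```python
from xml.sax import saxutils

# Assumed input: comment text as a comment handler would typically
# receive it, i.e. raw markup with no entities in it.
raw = '<Host name="lucee.org" appBase="webapps">'

# unescape() only rewrites &lt;, &gt; and &amp;, so raw text is unchanged.
unchanged = saxutils.unescape(raw)

# escape() is the operation that yields the kind of output shown above.
escaped = saxutils.escape(raw)

print(unchanged == raw)
print(escaped)
```

If the handler's data is already unescaped, the escaping has to be applied later, when the tree is written out.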

By the way, this question is similar to this one.

Best answer

Tested with Python 2.7 and 3.5, the following code should work as expected.

#!/usr/bin/env python
# CommentedTreeBuilder.py
from xml.etree import ElementTree

class CommentedTreeBuilder(ElementTree.TreeBuilder):
    def comment(self, data):
        self.start(ElementTree.Comment, {})
        self.data(data)
        self.end(ElementTree.Comment)

Then, in the main code, use

parser = ElementTree.XMLParser(target=CommentedTreeBuilder())

as the parser instead of the current one.
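Putting the pieces together, a minimal round trip might look like the sketch below. It is written for Python 3 and parses an in-memory string instead of the real server.xml. The `SAMPLE` document and the `add_host` helper are hypothetical names introduced for illustration. The comment is placed inside the root element on purpose, because this sketch does not attempt to handle comments that appear before the root.

```python
from xml.etree import ElementTree as ET


class CommentedTreeBuilder(ET.TreeBuilder):
    """TreeBuilder that stores comments as ET.Comment nodes."""

    def comment(self, data):
        self.start(ET.Comment, {})
        self.data(data)
        self.end(ET.Comment)


# Hypothetical stand-in for server.xml (not the real file).
SAMPLE = """<Server>
  <Service>
    <Engine>
      <!-- <Host name="lucee.org" appBase="webapps"></Host> -->
    </Engine>
  </Service>
</Server>"""


def add_host(xml_text):
    """Parse xml_text keeping comments, add a Host node, return the XML."""
    parser = ET.XMLParser(target=CommentedTreeBuilder())
    root = ET.fromstring(xml_text, parser=parser)
    engine = root.find("./Service/Engine")
    ET.SubElement(engine, "Host", name="local.mysite.com", appBase="webapps")
    return ET.tostring(root, encoding="unicode")


result = add_host(SAMPLE)
print(result)
```

For a file on disk, the same parser object can be passed as the second argument to ET.parse, as the question's script already does.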

By the way, comments work out of the box in lxml. That is, you can simply do

import lxml.etree as ET
tree = ET.parse(filename)

without needing any of the above.
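As a further option that is not part of the original answer: on Python 3.8 or later, the standard library's TreeBuilder accepts an insert_comments=True keyword, so a subclass may not be needed. The sketch below assumes such a Python version. `SAMPLE` and `parse_keeping_comments` are hypothetical names used only for illustration.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample input; the comment contains markup-like text.
SAMPLE = '<Engine><!-- <Host name="lucee.org"/> --><Host name="a"/></Engine>'


def parse_keeping_comments(xml_text):
    """Parse xml_text with a stock TreeBuilder that keeps comments (3.8+)."""
    builder = ET.TreeBuilder(insert_comments=True)
    parser = ET.XMLParser(target=builder)
    return ET.fromstring(xml_text, parser=parser)


root = parse_keeping_comments(SAMPLE)
output = ET.tostring(root, encoding="unicode")
print(output)
```

Comments that sit outside the root element are not kept by this approach, so lxml may still be the better fit when those matter.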

Regarding python - Faithfully preserving comments in parsed XML, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33573807/
