
python - How to insert an attribute with the correct namespace prefix using lxml

Reposted. Author: 行者123. Last updated: 2023-12-01 06:55:21

Is there a way, using lxml, to insert an XML attribute with the correct namespace prefix?

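For example, I want to use XLink to insert links in an XML document. All I need to do is insert the {http://www.w3.org/1999/xlink}href attribute into some elements. I would like to use the xlink prefix, but lxml generates prefixes such as "ns0", "ns1", and so on.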

Here is what I tried:

from lxml import etree

#: Name (and namespace) of the *href* attribute used to insert links.
HREF_ATTR = etree.QName("http://www.w3.org/1999/xlink", "href").text

content = """\
<body>
<p>Link to <span>StackOverflow</span></p>
<p>Link to <span>Google</span></p>
</body>
"""

targets = ["https://stackoverflow.com", "https://www.google.fr"]
body_elem = etree.XML(content)
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target

etree.dump(body_elem)

The dump looks like this:

<body>
<p>Link to <span xmlns:ns0="http://www.w3.org/1999/xlink" ns0:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span xmlns:ns1="http://www.w3.org/1999/xlink" ns1:href="https://www.google.fr">Google</span></p>
</body>

I found a way to factor out the namespace declaration by inserting and then deleting the attribute on the root element, like this:

# trick to declare the XLink namespace globally (only one time).
body_elem = etree.XML(content)
body_elem.attrib[HREF_ATTR] = ""
del body_elem.attrib[HREF_ATTR]

targets = ["https://stackoverflow.com", "https://www.google.fr"]
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target

etree.dump(body_elem)

It's ugly, but it works, and I only need to do it once. I get:

<body xmlns:ns0="http://www.w3.org/1999/xlink">
<p>Link to <span ns0:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span ns0:href="https://www.google.fr">Google</span></p>
</body>
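
The trick works because lxml keeps a namespace declaration on an element after the attribute that caused it is deleted, and later attributes in the same namespace reuse the nearest declaration in scope. A minimal sketch to check this, assuming the same XLink URI as above (XLINK_NS is a name introduced here for illustration only):

```python
from lxml import etree

# Hypothetical constant name; the URI is the XLink namespace used above.
XLINK_NS = "http://www.w3.org/1999/xlink"
HREF_ATTR = etree.QName(XLINK_NS, "href").text

body_elem = etree.XML("<body><p>Link to <span>StackOverflow</span></p></body>")

# Setting the attribute makes lxml declare the namespace on <body>;
# deleting the attribute leaves that declaration in place.
body_elem.attrib[HREF_ATTR] = ""
del body_elem.attrib[HREF_ATTR]
declared_on_root = dict(body_elem.nsmap)

# A later attribute in the same namespace reuses the declaration in scope,
# so no second xmlns declaration is added on <span>.
span_elem = next(body_elem.iter("span"))
span_elem.attrib[HREF_ATTR] = "https://stackoverflow.com"

result = etree.tostring(body_elem, encoding="unicode")
print(declared_on_root)
print(result)
```

The printed mapping should contain a single auto-generated prefix bound to the XLink URI, and the serialized tree should carry only one xmlns declaration.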

But the question remains: how do I turn this "ns0" prefix into "xlink"?

Best answer

Use register_namespace, as @mzjn suggested:

etree.register_namespace("xlink", "http://www.w3.org/1999/xlink")

# trick to declare the XLink namespace globally (only one time).
body_elem = etree.XML(content)
body_elem.attrib[HREF_ATTR] = ""
del body_elem.attrib[HREF_ATTR]

targets = ["https://stackoverflow.com", "https://www.google.fr"]
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target

etree.dump(body_elem)

The result is just what I expected:

<body xmlns:xlink="http://www.w3.org/1999/xlink">
<p>Link to <span xlink:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span xlink:href="https://www.google.fr">Google</span></p>
</body>
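
Note that register_namespace updates a process-wide prefix registry in lxml and only affects declarations created afterwards. As a possible alternative to the insert-and-delete trick, the sketch below uses etree.cleanup_namespaces with its top_nsmap argument (available in lxml 3.5 and later) to declare the prefix on the root and fold the per-element declarations into it. This is not part of the original answer, and XLINK_NS is a name introduced here for illustration only:

```python
from lxml import etree

# Hypothetical constant name; the URI is the XLink namespace used above.
XLINK_NS = "http://www.w3.org/1999/xlink"
HREF_ATTR = etree.QName(XLINK_NS, "href").text

content = """\
<body>
<p>Link to <span>StackOverflow</span></p>
<p>Link to <span>Google</span></p>
</body>
"""
targets = ["https://stackoverflow.com", "https://www.google.fr"]

body_elem = etree.XML(content)
for span_elem, target in zip(body_elem.iter("span"), targets):
    # At this point lxml declares an auto-generated prefix on each <span>.
    span_elem.attrib[HREF_ATTR] = target

# Declare xlink on the root, then let lxml drop the redundant
# per-element declarations and re-point the attributes to it.
etree.cleanup_namespaces(body_elem, top_nsmap={"xlink": XLINK_NS})

result = etree.tostring(body_elem, encoding="unicode")
print(result)
```

With this approach the prefix is chosen per document through top_nsmap, so no global registration is needed.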

Regarding "python - How to insert an attribute with the correct namespace prefix using lxml", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58833073/
