
python - Validating XML against an external DTD with Python and lxml

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:48:42

I am trying to validate an XML file against the external DTD referenced in its doctype declaration. Specifically:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-export SYSTEM "http://xml.evernote.com/pub/evernote-export3.dtd">
...the rest of the document...

I am using Python 3.3 and the lxml module. After reading http://lxml.de/validation.html#validation-at-parse-time, I put this together:

import sys
from lxml import etree

enexFile = open(sys.argv[2], mode="rb") # sys.argv[2] is the path to an XML file in local storage.
enexParser = etree.XMLParser(dtd_validation=True)
enexTree = etree.parse(enexFile, enexParser)

From my reading of validation.html, the lxml library should now take care of retrieving the DTD and performing the validation. Instead, I get this:

$ ./mapwrangler.py validate notes.enex
Traceback (most recent call last):
File "./mapwrangler.py", line 27, in <module>
enexTree = etree.parse(enexFile, enexParser)
File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src/lxml/lxml.etree.c:69955)
File "parser.pxi", line 1769, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:102257)
File "parser.pxi", line 1789, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:102516)
File "parser.pxi", line 1684, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:101442)
File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:97069)
File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:91275)
File "parser.pxi", line 683, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:92461)
File "parser.pxi", line 622, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:91757)
lxml.etree.XMLSyntaxError: Validation failed: no DTD found !, line 3, column 43

This surprises me, because if I turn validation off, the document parses fine, and I can run print(enexTree.docinfo.doctype) to get

$ ./mapwrangler.py validate notes.enex
<!DOCTYPE en-export SYSTEM "http://xml.evernote.com/pub/evernote-export3.dtd">

So it seems to me that there should be no problem finding the DTD.

Thanks for your help.

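Best answer

You need to add no_network=False when constructing the parser object. This option defaults to True, so by default lxml will not fetch a DTD that is referenced by a network URL.

From the parser options documentation at http://lxml.de/parsing.html#parsers: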

no_network - prevent network access when looking up external documents (on by default)
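
As a minimal sketch of that fix, the parser from the question could be built as follows. The helper names make_validating_parser and validate_file are hypothetical and only used for illustration; dtd_validation and no_network are the lxml parser options quoted above. Whether the download of the DTD actually succeeds still depends on the environment the script runs in.

```python
from lxml import etree


def make_validating_parser(allow_network=True):
    """Hypothetical helper: build a DTD-validating XMLParser.

    lxml sets no_network=True by default, which blocks fetching a DTD
    referenced by a network URL; passing no_network=False lifts that block.
    """
    return etree.XMLParser(dtd_validation=True, no_network=not allow_network)


def validate_file(path, allow_network=True):
    """Hypothetical helper: parse the file at `path` with DTD validation.

    Returns the parsed tree. Raises etree.XMLSyntaxError if the DTD cannot
    be loaded or the document does not conform to it.
    """
    parser = make_validating_parser(allow_network)
    with open(path, "rb") as xml_file:
        return etree.parse(xml_file, parser)
```

In the script from the question, this would be called as enexTree = validate_file(sys.argv[2]). Keeping no_network=True as the default for untrusted input is a reasonable precaution, since a document can point its doctype at any URL.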

Regarding python - validating XML against an external DTD with Python and lxml, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22392180/
