
Python rdflib does not parse Creative Commons license information properly

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:26:30

I was using rdflib version 3.2.3 and everything worked fine. After upgrading to 4.0.1 I started getting this error:

RDFa parsing Error! 'ascii' codec can't decode byte 0xc3 in position 5454: ordinal not in range(128)

I have tried several ways to get this working, but so far without success. My attempts are listed below.

In each case I start with:

from rdflib import Graph

First attempt:

>>> lg = Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/graph.py", line 1002, in parse
    parser.parse(source, self, **args)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 268, in parse
    vocab_cache=vocab_cache)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 148, in _process
    _check_error(processor_graph)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/structureddata.py", line 57, in _check_error
    raise Exception("RDFa parsing Error! %s" % msg)
Exception: RDFa parsing Error! 'ascii' codec can't decode byte 0xc3 in position 4801: ordinal not in range(128)

Second attempt:

>>> lg = Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/rdf'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/graph.py", line 1002, in parse
    parser.parse(source, self, **args)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 570, in parse
    self._parser.parse(source)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 160, in endElementNS
    self.current.end(name, qname)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/plugins/parsers/rdfxml.py", line 461, in property_element_end
    current.data, literalLang, current.datatype)
  File "/home/alex/Projects/RDF/rdfEnv/local/lib/python2.7/site-packages/rdflib/term.py", line 541, in __new__
    raise Exception("'%s' is not a valid language tag!"%lang)
Exception: 'i18n' is not a valid language tag!

Third attempt: no error, but no results either

>>> lg = Graph()
>>> len(lg.parse('http://creativecommons.org/licenses/by/3.0/rdf', format='rdfa'))
0

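So could someone please tell me what I am doing wrong! :)

Best answer

As Graham replied on the rdflib mailing list, there is an html5lib problem. We will pin it correctly for Python 2 in the next release, but for now just do: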

pip install html5lib==0.95
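
For context on the error message itself: byte 0xc3 is the lead byte of a two-byte UTF-8 sequence, and the ascii codec rejects any byte above 0x7f. The sketch below reproduces the same kind of failure on a small sample. It is only an illustration of the decoding error, not the actual html5lib or rdflib code path, and the helper name try_ascii_decode and the sample string are made up for this example.

```python
def try_ascii_decode(raw):
    """Decode bytes as ASCII; return the text, or the error message on failure."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return "error: %s" % exc


# Hypothetical sample: "cafe" with an accented e, encoded as UTF-8.
# The accented letter becomes the two bytes 0xc3 0xa9.
sample = u"caf\u00e9".encode("utf-8")

# Decoding with the ascii codec fails at the 0xc3 byte.
result_ascii = try_ascii_decode(sample)
print(result_ascii)

# Decoding with the codec the data was actually written in succeeds.
result_utf8 = sample.decode("utf-8")
print(result_utf8 == u"caf\u00e9")
```

The error appears whenever UTF-8 bytes are decoded, explicitly or implicitly, with the ascii codec, which is the default for implicit conversions in Python 2.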

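The second problem is an issue in the Creative Commons data: according to RFC 5646, "i18n" really is not a valid language tag. I added the check, but in retrospect raising an exception seems rather strict. I think I will change it to a warning.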
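
To illustrate the difference between raising and warning, here is a minimal sketch of a language tag check. It assumes a simplified pattern (a first subtag of letters only, followed by optional hyphen-separated alphanumeric subtags) rather than the full RFC 5646 grammar, and the names LANG_TAG and check_lang_tag are hypothetical, not rdflib's API.

```python
import re
import warnings

# Simplified pattern (an assumption, not the full RFC 5646 grammar):
# letters only in the first subtag, then optional "-alphanumeric" subtags.
LANG_TAG = re.compile(r"^[a-zA-Z]+(?:-[a-zA-Z0-9]+)*$")


def check_lang_tag(tag, strict=False):
    """Return True if tag matches the simplified pattern.

    On a mismatch, raise ValueError when strict is set;
    otherwise emit a warning and return False.
    """
    if LANG_TAG.match(tag):
        return True
    message = "'%s' is not a valid language tag!" % tag
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return False


print(check_lang_tag("en"))     # True
print(check_lang_tag("en-GB"))  # True
print(check_lang_tag("i18n"))   # False, and a warning is emitted
```

"i18n" fails here because its first subtag contains digits. With strict=False the caller can still load the rest of the data, which is the lenient behaviour described above.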

Regarding "Python rdflib does not parse Creative Commons license information properly", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17462385/
