
python - ValueError: Unexpected character found when decoding 'true' while converting IOB to JSONL for spaCy

Reposted. Author: 行者123. Updated: 2023-11-30 09:39:09

I want to convert an IOB-format text file to spaCy's format, with a start and end index assigned to each token.

I run this command:

python -m spacy convert test_IOB.txt out --converter jsonl --lang English

I get this error:

ValueError: Unexpected character found when decoding 'true'

My input data looks like this:

the O
r O
/ O
p O
( O
years O
) O
ratio O
of O
the O
sand O
is O
16 O
. O

chiaramonte O
, O
l. O
2008 O
, O
geomechanical O
characterization O
and O
reservoir O
simulation O
of O
a O
co O
sequestration O
project O
in O
a O
mature O
ofield O

Thanks!

Best answer

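You are calling the --converter jsonl option, but your input file is not in JSONL format. The converter tries to parse each line as JSON, which is why decoding fails on the plain "token tag" lines.

For the input you are working with, you should use --converter ner instead.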

ner

NER with IOB/IOB2 tags, one token per line with columns separated by whitespace. The first column is the token and the final column is the IOB tag. Sentences are separated by blank lines and documents are separated by the line -DOCSTART- -X- O O. Supports CoNLL 2003 NER format. See sample data.
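
To make the format described above concrete, here is a minimal Python sketch that reads whitespace-separated token/tag lines, splits sentences on blank lines, and attaches start and end character indices to each token. It is not spaCy's converter and does not use spaCy at all. The function names `read_iob_sentences` and `add_offsets` are hypothetical, and the offsets assume the tokens of a sentence are joined by a single space, which may not match the original text.

```python
from typing import Iterable, Iterator, List, Tuple


def read_iob_sentences(lines: Iterable[str]) -> Iterator[List[Tuple[str, str]]]:
    """Yield each sentence as a list of (token, tag) pairs."""
    sentence: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.strip()
        # Blank lines end a sentence; -DOCSTART- lines mark a document boundary.
        if not line or line.startswith("-DOCSTART-"):
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split()
        # First column is the token, final column is the IOB tag.
        sentence.append((columns[0], columns[-1]))
    if sentence:
        yield sentence


def add_offsets(sentence: List[Tuple[str, str]]) -> List[Tuple[str, str, int, int]]:
    """Attach (start, end) character offsets to each token.

    Assumption: tokens are joined by exactly one space.
    """
    result = []
    position = 0
    for text, tag in sentence:
        start = position
        end = start + len(text)
        result.append((text, tag, start, end))
        position = end + 1  # skip the assumed single space
    return result


sample = """the O
sand O
is O
16 O
. O

chiaramonte O
, O
l. O
2008 O
""".splitlines()

sentences = [add_offsets(s) for s in read_iob_sentences(sample)]
for sent in sentences:
    print(sent)
```

Each sentence restarts its offsets at 0. If document-level offsets are needed, the running position would have to be carried across sentences instead.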

Regarding python - ValueError: Unexpected character found when decoding 'true' while converting IOB to JSONL for spaCy, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60032646/
