
python - Using spaCy 3.0 to convert data from the old spaCy v2 format to the new spaCy v3 format

Reposted. Author: 行者123. Last updated: 2023-12-04 16:37:42

I have a variable trainData with the following simplified format.

[
    ('Paragraph_A', {"entities": [(15, 26, 'DiseaseClass'), (443, 449, 'DiseaseClass'), (483, 496, 'DiseaseClass')]}),
    ('Paragraph_B', {"entities": [(969, 975, 'DiseaseClass'), (1257, 1271, 'SpecificDisease')]}),
    ('Paragraph_C', {"entities": [(0, 27, 'SpecificDisease')]})
]
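
Before converting data in this format, it can help to check that every (start, end, label) offset falls inside its text and that no two entities overlap, since doc.ents only accepts non-overlapping spans. The following is a minimal sketch in plain Python; the helper name find_bad_offsets and the sample sentences are made up for illustration and are not part of the original question.

```python
def find_bad_offsets(train_data):
    """Return (text_index, start, end, label, reason) for suspicious annotations."""
    problems = []
    for i, (text, annot) in enumerate(train_data):
        previous_end = 0
        # Sort by start offset so overlaps can be detected in one pass.
        for start, end, label in sorted(annot["entities"]):
            if not (0 <= start < end <= len(text)):
                problems.append((i, start, end, label, "out of range"))
                continue
            if start < previous_end:
                problems.append((i, start, end, label, "overlaps previous entity"))
            previous_end = max(previous_end, end)
    return problems


# Hypothetical sample data in the same (text, {"entities": [...]}) shape.
sample = [
    ("Colorectal cancer is common.", {"entities": [(0, 17, "SpecificDisease")]}),
    ("No tumour was found.", {"entities": [(3, 9, "DiseaseClass"),
                                          (5, 9, "DiseaseClass"),
                                          (40, 50, "DiseaseClass")]}),
]

problems = find_bad_offsets(sample)
for problem in problems:
    print(problem)
```

An empty result does not guarantee that the offsets line up with token boundaries; that can only be checked once the text has been tokenized.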
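I am trying to convert trainData to the .spacy format by first converting each item to a Doc and then adding it to a DocBin. The full trainData file is accessible via GoogleDocs.
I tried to reproduce what is described in this tutorial, but it did not work for me. The tutorial is: Using spaCy 3.0 to build a custom NER model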

I tried the following.

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en") # load a new spacy model
db = DocBin() # create a DocBin object

for text, annot in trainData: # data in previous format
    doc = nlp.make_doc(text) # create doc object from text
    ents = []
    for start, end, label in annot["entities"]: # add character indexes
        span = doc.char_span(start, end, label=label, alignment_mode="contract")
        ents.append(span)
    doc.ents = span # label the text with the ents
    db.add(doc)

db.to_disk("./train.spacy") # save the docbin object

However, something is wrong in my code for converting the data from spaCy v2 to spaCy v3.
With the snippet above I get a traceback: TypeError: 'spacy.tokens.token.Token' object is not iterable.

Best answer

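You have a small bug: doc.ents is assigned span (only the last span created) instead of the list ents. Because iterating over a single Span yields its Token objects, spaCy ends up trying to treat each Token as an entity, which leads to the TypeError. Look for XXX in the code to find the changed line.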

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en") # load a new spacy model
db = DocBin() # create a DocBin object

for text, annot in trainData: # data in previous format
    doc = nlp.make_doc(text) # create doc object from text
    ents = []
    for start, end, label in annot["entities"]: # add character indexes
        span = doc.char_span(start, end, label=label, alignment_mode="contract")
        ents.append(span)
    #XXX FOLLOWING LINE CHANGED
    doc.ents = ents # label the text with the ents
    db.add(doc)

db.to_disk("./train.spacy") # save the docbin object
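
One thing to watch for with this loop: doc.char_span returns None when no tokens fit completely inside the character range, and a None in the list will make the doc.ents assignment fail. The sketch below is one possible way to guard against that, not part of the original answer. It skips unaligned annotations, uses spacy.util.filter_spans to drop overlapping spans, and reads the DocBin back to check the result. The function name convert_to_docbin and the sample sentences are assumptions made for illustration.

```python
import spacy
from spacy.tokens import DocBin
from spacy.util import filter_spans


def convert_to_docbin(train_data, nlp):
    """Convert v2-style (text, {"entities": [...]}) tuples into a DocBin.

    Returns the DocBin plus a list of (text, start, end, label) annotations
    that could not be aligned to token boundaries and were skipped.
    """
    db = DocBin()
    skipped = []
    for text, annot in train_data:
        doc = nlp.make_doc(text)
        spans = []
        for start, end, label in annot["entities"]:
            span = doc.char_span(start, end, label=label, alignment_mode="contract")
            if span is None:
                # No whole token lies inside this character range.
                skipped.append((text, start, end, label))
            else:
                spans.append(span)
        # doc.ents rejects overlapping spans; filter_spans keeps the longest ones.
        doc.ents = filter_spans(spans)
        db.add(doc)
    return db, skipped


# Hypothetical sample data; the real trainData from the question is not shown here.
sample_data = [
    ("Patients with colorectal cancer were screened.",
     {"entities": [(14, 31, "SpecificDisease")]}),
    ("No tumour was found.",
     {"entities": [(3, 9, "DiseaseClass"), (14, 17, "DiseaseClass")]}),
]

nlp = spacy.blank("en")
db, skipped = convert_to_docbin(sample_data, nlp)

# Read the docs back from the DocBin to inspect the stored entities.
docs = list(db.get_docs(nlp.vocab))
for doc in docs:
    print([(ent.text, ent.label_) for ent in doc.ents])
print("skipped:", skipped)

# db.to_disk("./train.spacy") would then save the result as in the answer above.
```

Logging the skipped annotations instead of silently dropping them makes it easier to spot offset errors in the source data.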

Regarding "python - Using spaCy 3.0 to convert data from the old spaCy v2 format to the new spaCy v3 format", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67407433/
