Problem description
I want to train a custom NER model with spaCy v3. I prepared my training data and used this script:
import spacy
from spacy.tokens import DocBin
from tqdm import tqdm  # progress bar used in the loop below

nlp = spacy.blank("en")  # load a new spacy model
db = DocBin()  # create a DocBin object
for text, annot in tqdm(TRAIN_DATA):  # data in previous format
    doc = nlp.make_doc(text)  # create doc object from text
    ents = []
    for start, end, label in annot["entities"]:  # add character indexes
        span = doc.char_span(start, end, label=label)
        # char_span returns None when the offsets do not match token boundaries
        if span is not None:
            ents.append(span)
    doc.ents = ents  # label the text with the ents
    db.add(doc)
db.to_disk("./train.spacy")  # save the docbin object
Then it prints this error:
AttributeError: 'DocBin' object has no attribute 'to_disk'
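Solution
Make sure you are really using spaCy 3, in case you aren't :)
You can check the installed version by running python -c "import spacy; print(spacy.__version__)" from the command line.
DocBin.to_disk was added in spaCy 3, so an older install raises this AttributeError. If that is your case, install spaCy 3 in your Python environment with pip install spacy==3.0.6, then run the following in the Python console: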
import spacy
from spacy.tokens import DocBin
nlp = spacy.blank("en") # load a new spacy model
db = DocBin() # create a DocBin object
# omitting code for debugging purposes
db.to_disk("./train.spacy") # save the docbin object
You should not get an error.
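If you would rather do the version check from inside a script than from the shell, here is a minimal sketch. The helper names (major_version, installed_spacy_version, supports_docbin_to_disk) are made up for illustration and are not part of spaCy. It reads the installed version from package metadata and assumes that DocBin.to_disk is available from major version 3 onward.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.0.6'."""
    return int(version.split(".")[0])


def installed_spacy_version():
    """Return the installed spaCy version string, or None if spaCy is not installed."""
    try:
        return metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return None


def supports_docbin_to_disk(version) -> bool:
    """Assumption: DocBin.to_disk exists from spaCy 3 onward."""
    return version is not None and major_version(version) >= 3


if __name__ == "__main__":
    version = installed_spacy_version()
    print("spaCy version:", version)
    print("DocBin.to_disk expected:", supports_docbin_to_disk(version))
```

If the second line prints False, the environment running your script is probably not the one where you installed spaCy 3, which is a common cause of this error.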