How do I train an NER model with different beam objective parameters in spaCy?

Problem description

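I am trying to update the pretrained spaCy model en_core_web_md with a few rounds of a beam objective other than beam_width = 1, but I cannot find the right way to pass the different parameters into **cfg so that the model actually uses them for training (at THIS point).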

Here is my most recent attempt:

pipe_exceptions = ["ner","trf_wordpiecer","trf_tok2vec"]
other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]
# only train NER
with nlp.disable_pipes(*other_pipes),warnings.catch_warnings():
    # show warnings for misaligned entity spans once
    warnings.filterwarnings("once",category=UserWarning,module='spacy')

    # TRY TO FORCE BEAM TRAINING INSTEAD OF GREEDY METHOD
    nlp.use_params({'ner':{'beam_width':16,'beam_density':0.0001}})
    print(nlp.meta)

    sizes = compounding(1.0,4.0,1.001)
    # batch up the examples using spaCy's minibatch
    for itn in range(n_iter):
        random.shuffle(TRAIN_DATA_2)
        batches = minibatch(TRAIN_DATA_2,size=sizes)
        losses = {}
        for batch in batches:
            texts,annotations = zip(*batch)
            nlp.update(texts,annotations,sgd=optimizer,drop=0.35,losses=losses
            )
        print("Losses",losses)

However, after training, the model/ner/cfg file still lists:

{
"beam_width":1,"beam_density":0.0,"beam_update_prob":1.0,...

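So I have a few questions:

Can I update an existing greedily trained model with a new beam objective?
If so, how do I correctly change these training parameters (and confirm that they have changed)?
If not, how do I correctly change these training parameters for a new model trained from scratch (and confirm that they have changed)?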

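Why do this? I am trying to train a model that provides probabilities for its NER decisions, which I can surface to my users. THIS post and a few others show how to use beam_parse to obtain probabilities after the fact from a greedy model. However, they all mention that the greedy model has not been trained with a global objective, so these scores are not especially meaningful unless you also perform some iterations of beam training. (link to github issue)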

Solution

I found the answer in this stack post. This is the syntax for modifying the config parameters:

nlp.entity.cfg['beam_width'] = 16
nlp.entity.cfg['beam_density'] = 0.0001

I placed these lines before optimizer = nlp.resume_training(), and these values were then used for training.
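
To also cover the "confirm that they have changed" part of the question, here is a minimal sketch of two small helpers. The function names set_beam_params and read_saved_beam_params are hypothetical, not part of spaCy. The sketch assumes a spaCy v2-style pipeline where nlp.entity.cfg is a plain dict (as used above), and that the saved model/ner/cfg file is plain JSON, as the excerpt in the question suggests.

```python
import json
from pathlib import Path

# Beam-related keys that appear in the cfg excerpt from the question.
BEAM_KEYS = ("beam_width", "beam_density", "beam_update_prob")


def set_beam_params(nlp, beam_width=16, beam_density=0.0001):
    """Write beam settings into the NER component's cfg dict.

    Hypothetical helper: assumes `nlp.entity.cfg` is a mutable dict,
    as in the answer above. Call it before nlp.resume_training().
    """
    ner_cfg = nlp.entity.cfg
    ner_cfg["beam_width"] = beam_width
    ner_cfg["beam_density"] = beam_density
    # Return the values now stored, so the caller can print or log them.
    return {key: ner_cfg[key] for key in ("beam_width", "beam_density")}


def read_saved_beam_params(model_dir):
    """Read the beam-related entries back from <model_dir>/ner/cfg.

    Assumption: the cfg file is JSON. Keys missing from the file are skipped.
    """
    cfg_path = Path(model_dir) / "ner" / "cfg"
    with cfg_path.open(encoding="utf8") as f:
        cfg = json.load(f)
    return {key: cfg[key] for key in BEAM_KEYS if key in cfg}
```

Intended usage, under the same assumptions: load the model, call set_beam_params(nlp), then optimizer = nlp.resume_training(), run the training loop, save with nlp.to_disk("model"), and finally print read_saved_beam_params("model") to check that the file no longer shows beam_width 1.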

python spacy