AllenNLP reading comprehension gives different results in the UI demo and the Python library

Problem description

I am trying to use AllenNLP reading comprehension with the Transformer QA model to get an answer to the question "Who is CEO of ABB?" from the passage "ABB opened its first dedicated global healthcare research center for robotics in October 2019."

As expected, the UI demo reports that no answer was returned, and the API response in the network tab shows the same. In the JSON response, best_span_str is empty, but best_span_scores is 9.9. (Screenshot: demo UI and API response in the network tab.)

When I run the equivalent code through the Python library, I get a different result.

from allennlp.predictors.predictor import Predictor
import pandas

def allen_nlp_demo_1():
  import allennlp_models.structured_prediction
  import allennlp_models.rc
  predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/transformer-qa-2020-05-26.tar.gz")
  data = predictor.predict(
    passage="ABB opened its first dedicated global healthcare research center for robotics in October 2019.",
    question="Who is CEO of ABB?",
  )
  print(data)

if __name__ == '__main__':
  allen_nlp_demo_1()

This produces the following JSON output:

{
  "span_start_logits": [...],
  "best_span": [7, 15],
  "best_span_scores": -10.418445587158203,
  "loss": 0,
  "best_span_str": "healthcare research center for robotics in October 2019",
  "context_tokens": [...],
  "id": "1",
  "answers": []
}

Here best_span_str is populated, and best_span_scores is -10.418445587158203. (Attachment: Python code and output snippet.)

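Why does the output differ between the UI demo and the library? Also, what is the range of best_span_scores, and how can I determine a threshold for discarding wrong results?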

Solution

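The demo output differs from your run because the live demo uses a different archive file than the one in your code. The usage snippet shown in the demo has since been updated to reflect the new file path (transformer-qa-2020-10-03.tar.gz).
When searching for best_span, the model treats a prediction on the CLS token as meaning that the question is unanswerable. best_span reflects this: it is [-1, -1] when the question cannot be answered. When the question is answered, the span scores are only relative to one another; we simply pick the highest-scoring span. There is therefore no fixed threshold that works in all cases.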
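A minimal sketch of how the "unanswerable" case described above could be handled on the library side, instead of thresholding best_span_scores. The helper names (answer_or_none, load_predictor) are hypothetical and not part of AllenNLP, and the full archive URL is an assumption: it joins the bucket URL from the question with the file name mentioned above. The sketch also assumes the newer archive returns best_span as [-1, -1] for unanswerable questions, as stated in the answer.

```python
from typing import Optional

# Assumption: bucket URL from the question + archive name from the answer.
ARCHIVE_URL = (
    "https://storage.googleapis.com/allennlp-public-models/"
    "transformer-qa-2020-10-03.tar.gz"
)

# Per the answer, this span marks a question the model considers unanswerable.
NO_ANSWER_SPAN = [-1, -1]


def answer_or_none(prediction: dict) -> Optional[str]:
    """Return the predicted answer text, or None if the model abstained."""
    span = list(prediction.get("best_span", NO_ANSWER_SPAN))
    if span == NO_ANSWER_SPAN:
        return None
    # An empty best_span_str is also treated as "no answer",
    # matching what the demo's JSON response showed.
    return prediction.get("best_span_str") or None


def load_predictor(archive_url: str = ARCHIVE_URL):
    """Load the QA predictor (hypothetical helper; downloads the archive)."""
    # Imported lazily so answer_or_none can be used without allennlp installed.
    import allennlp_models.rc  # registers the reading-comprehension models
    from allennlp.predictors.predictor import Predictor

    return Predictor.from_path(archive_url)


# Usage (requires allennlp, allennlp-models, and network access):
#   predictor = load_predictor()
#   result = predictor.predict(
#       passage="ABB opened its first dedicated global healthcare research "
#               "center for robotics in October 2019.",
#       question="Who is CEO of ABB?",
#   )
#   print(answer_or_none(result))
```

Checking the span rather than the score follows from the point above: scores are only comparable between candidate spans for the same input, so the [-1, -1] marker is the more reliable signal that the model declined to answer.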

allennlp nlp python
