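Google Knowledge Graph Search API: strange results

Problem description

I want to use this API, but the results confuse me.

  1. I searched with the query string "Abs", types = "Brand", and languages = "en", and got 2 correct results and 1 wrong one. Please check the response from the KG Search API Explorer: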
{
  "@context": {
    "detailedDescription": "goog:detailedDescription","goog": "http://schema.googleapis.com/","EntitySearchResult": "goog:EntitySearchResult","kg": "http://g.co/kg","resultscore": "goog:resultscore","@vocab": "http://schema.org/"
  },"@type": "ItemList","itemListElement": [
    {
      "@type": "EntitySearchResult","resultscore": 296.41555786132812,"result": {
        "@id": "kg:/m/01bnqx","@type": [
          "Brand","Thing"
        ],"name": "Absolut Vodka","detailedDescription": {
          "license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License","url": "https://en.wikipedia.org/wiki/Absolut_Vodka","articleBody": "Absolut Vodka is a brand of vodka, produced near Åhus, in southern Sweden. Absolut is a part of the French group Pernod Ricard. Pernod Ricard bought Absolut for €5.63 billion in 2008 from the Swedish state. Absolut is one of the largest brands of spirits in the world and is sold in 126 countries.\n"
        },"url": "http://www.absolut.com"
      }
    },{
      "result": {
        "@id": "kg:/m/04hqw8","name": "Absolute","detailedDescription": {
          "articleBody": "Absolute is the brand of a long-running series of compilation albums owned by the Swedish record company EVA Records. Initially, the only albums in the series were called Absolute Music, but starting in 1990 there have been other themed albums such as Absolute Dance and Absolute Rock.","url": "https://en.wikipedia.org/wiki/Absolute_(record_compilation)"
        }
      },"resultscore": 103.74134826660161,"@type": "EntitySearchResult"
    },{
      "@type": "EntitySearchResult","resultscore": 0.0041735083796083927,"result": {
        "detailedDescription": {
          "articleBody": "S-AWC is the brand name of an advanced full-time four-wheel drive system developed by Mitsubishi Motors. The technology, specifically developed for the new 2007 Lancer Evolution, the 2010 Outlander, the 2014 Outlander, the Outlander PHEV and the Eclipse Cross have an advanced version of Mitsubishi Motors' AWC system. ","url": "https://en.wikipedia.org/wiki/Mitsubishi_S-AWC","license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License"
        },"name": "Mitsubishi S-AWC","@id": "kg:/m/02vtht5"
      }
    }
  ]
}

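"Absolut Vodka" and "Absolute" are fine, but honestly I don't understand why "Mitsubishi S-AWC" is listed in this result (with such a low result score). Any ideas are appreciated :)

  2. I think a feature such as a minimum resultscore set as a query parameter would be great! I did not find anything like it here: Method entities.search

  3. Also, I have not found any information on the minimum number of characters accepted as a search string (2, 3, more?)

Thanks!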

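Solution

The Google Entity Search API returns full-text search results across all languages. The "languages" parameter does not affect the search itself, only the output.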
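For reference, here is a minimal sketch of how the request from the question might be assembled. It assumes the public endpoint https://kgsearch.googleapis.com/v1/entities:search and its query, types, languages, limit, and key parameters; the helper name build_search_url and the "YOUR_API_KEY" placeholder are hypothetical and not part of any Google library.

```python
from urllib.parse import urlencode

# Assumed public endpoint of the Knowledge Graph Search API.
KG_SEARCH_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, types=(), languages=(), limit=10):
    """Build an entities.search URL (hypothetical helper, not an official client)."""
    params = [("query", query), ("key", api_key), ("limit", limit)]
    # types and languages are repeated parameters: one entry per value.
    params += [("types", t) for t in types]
    params += [("languages", lang) for lang in languages]
    return KG_SEARCH_ENDPOINT + "?" + urlencode(params)


# The request from the question: query "Abs", types = "Brand", languages = "en".
url = build_search_url("Abs", "YOUR_API_KEY", types=["Brand"], languages=["en"])
print(url)
```

Going by the explanation above, languages=["en"] here only selects the language of the returned names and descriptions; it does not restrict which indexed text the query is matched against.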

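Specifically, searching for "ABS" returns "Mitsubishi S-AWC" because the corresponding article on the Chinese Wikipedia contains the token ABS in its abstract [1].

For example, you can search for "S-AWC is the brand name" with the language set to Chinese and get a link to the Chinese Wikipedia [2], even though the Chinese article does not contain those words.

The score here is some variant of BM25 [3]. You can filter on it however you like (for example, take only the first result), but in your example the response is correct.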
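Since entities.search does not appear to offer a minimum-score parameter, the filtering has to happen on the client. The sketch below shows one way to do it against the response from the question; filter_by_score and the 1.0 threshold are illustrative assumptions, and because the score is not normalized, a sensible cutoff has to be chosen per use case.

```python
def filter_by_score(response, min_score):
    """Keep only results whose score is at least min_score (hypothetical helper)."""
    kept = []
    for item in response.get("itemListElement", []):
        # Accept the key as spelled in the question ("resultscore") as well as
        # the camelCase "resultScore" (assumed spelling of the live API).
        score = item.get("resultScore", item.get("resultscore", 0.0))
        if score >= min_score:
            kept.append(item)
    return kept


# Trimmed-down copy of the response from the question.
sample = {
    "itemListElement": [
        {"resultscore": 296.41555786132812, "result": {"name": "Absolut Vodka"}},
        {"resultscore": 103.74134826660161, "result": {"name": "Absolute"}},
        {"resultscore": 0.0041735083796083927, "result": {"name": "Mitsubishi S-AWC"}},
    ]
}

names = [item["result"]["name"] for item in filter_by_score(sample, min_score=1.0)]
print(names)
```

With this threshold the "Mitsubishi S-AWC" entry is dropped while the two expected brands remain.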

[1] https://zh.wikipedia.org/zh-cn/S-AWC%E8%B6%85%E8%83%BD%E5%85%A8%E6%99%82%E5%9B%9B%E8%BC%AA%E6%8E%A7%E5%88%B6%E7%B3%BB%E7%B5%B1

[2] https://angryloki.github.io/mreid-resolver/#/search?lang=zh&q=S-AWC%20is%20the%20brand%20name&type=Brand

[3] https://en.wikipedia.org/wiki/Okapi_BM25