How does Elasticsearch handle scores in ngram queries?

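Problem description

My index, climate_change, contains hundreds of chemical substance records.

I am working with ngrams, and these are the settings I use for the index.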

{
  "settings": {
    "index.max_ngram_diff": 30,
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer": {
            "tokenizer": "test_ngram",
            "filter": [
              "lowercase"
            ]
          },
          "search_analyzer": {
            "tokenizer": "test_ngram",
            "filter": [
              "lowercase"
            ]
          }
        },
        "tokenizer": {
          "test_ngram": {
            "type": "edge_ngram",
            "min_gram": 1,
            "max_gram": 30,
            "token_chars": [
              "letter",
              "digit"
            ]
          }
        }
      }
    }
  }
}

My main problem is that if I run a query like this:

GET climate_change/_search?size=1000
{
  "query": {
    "match": {
      "description": {
        "query":"oxygen"
      }
    }
  }
}

I see that many results share the same score, 7.381186, which seems odd:

      {
        "_index" : "climate_change",
        "_type" : "_doc",
        "_id" : "XXX",
        "_score" : 7.381186,
        "_source" : {
          "recordtype" : "chemicals",
          "description" : "carbon/oxygen"
        }
      },
      {
        "_index" : "climate_change",
        "_id" : "YYY",
        "description" : "oxygen"
      }

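How is that possible? In the example above, since I am using ngrams and searching for oxygen in the description field, I would expect the second result to score higher than the first. I also tried setting the tokenizer type to "standard" and "whitespace" in the settings, but it did not help. Could the "/" character in the description be the cause?

Thank you very much!

Solution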

You also need to define the analyzer for the description field in the mapping.

Here is a working example with the index data, mapping, search query, and search results.

Index mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "test_ngram",
          "filter": [
            "lowercase"
          ]
        },
        "search_analyzer": {
          "tokenizer": "test_ngram",
          "filter": [
            "lowercase"
          ]
        }
      },
      "tokenizer": {
        "test_ngram": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 30,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
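
To get a feel for what the description field would contain once my_analyzer is applied, here is a rough Python sketch that imitates an edge_ngram tokenizer with token_chars of letter and digit, followed by a lowercase filter. It is only an approximation written for illustration (the function name edge_ngrams is made up here, and it is not Elasticsearch code). It suggests that "/" acts purely as a separator, and that every token produced for the query "oxygen" also appears in both documents.

```python
import re


def edge_ngrams(text, min_gram=1, max_gram=30):
    """Approximate edge_ngram (token_chars: letter, digit) plus lowercase.

    Hypothetical helper for illustration only, not part of Elasticsearch.
    """
    tokens = []
    # Characters that are neither letters nor digits (such as "/") split words.
    for word in re.findall(r"[^\W_]+", text):
        word = word.lower()
        # Emit every prefix of the word from min_gram up to max_gram characters.
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:size])
    return tokens


doc_a = edge_ngrams("carbon/oxygen")
doc_b = edge_ngrams("oxygen")
query = edge_ngrams("oxygen")

print(doc_a)
print(doc_b)
# Both documents contain every token generated from the query text.
print(set(query) <= set(doc_a), set(query) <= set(doc_b))
```

To see the tokens Elasticsearch really produces, the _analyze API can be called on the index with "analyzer": "my_analyzer" and the text to inspect.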

Index data:

{
  "recordtype": "chemicals",
  "description": "carbon/oxygen"
}
{
  "recordtype": "chemicals",
  "description": "oxygen"
}

Search query:

{
  "query": {
    "match": {
      "description": {
        "query":"oxygen"
      }
    }
  }
}
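
Elasticsearch scores text fields with BM25 by default (k1 = 1.2, b = 0.75), and BM25 normalizes by field length. The Python sketch below uses the textbook BM25 term formula to illustrate why a shorter field would be expected to score higher for the same matching term. The idf of 1.0, the average length of 9 tokens, and the field lengths of 6 and 12 tokens (the counts an edge_ngram tokenizer would emit for "oxygen" and "carbon/oxygen") are assumptions chosen for illustration, so the numbers will not match real _score values.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf=1.0, k1=1.2, b=0.75):
    """Textbook BM25 contribution of a single term (illustrative only)."""
    # Longer-than-average fields get a larger denominator, hence a lower score.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf / (tf + length_norm)


# Assumed token counts: "oxygen" -> 6 edge n-grams, "carbon/oxygen" -> 12.
short_field = bm25_term_score(tf=1, doc_len=6, avg_doc_len=9)
long_field = bm25_term_score(tf=1, doc_len=12, avg_doc_len=9)

print(short_field, long_field)
print(short_field > long_field)
```

Adding "explain": true to the search request body makes Elasticsearch return the actual per-term score breakdown, which is the reliable way to check how a given hit was scored.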
