How to do an exact match on a field in Elasticsearch when every value is inside a list

Problem description

Sample data in my ES index:

{
  "entities" : [
    {
      "fieldName" : "abc"
    },{
      "fieldName" : "def"
    }
  ]
},
{
  "entities" : [
    {
      "fieldName" : "abc"
    },{
      "fieldName" : "def"
    }
  ]
},
{
  "entities" : [
    {
      "fieldName" : "abc"
    },{
      "fieldName" : "def"
    },{
      "fieldName" : "gh"
    }
  ]
}

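I only want to find documents whose fieldName values are exactly "abc" and "def", so I tried an ES nested match query. The problem is that it also matches documents that contain extra fields in addition to "abc" and "def".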

GET fetch_latest_version/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "entities",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "entities.fieldName": "abc"
                    }
                  }
                ]
              }
            }
          }
        },{
          "nested": {
            "path": "entities",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "entities.fieldName": "def"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

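The query above returns all 3 documents in the sample data. I want it to match only the first and second documents (an exact match), where the entities list contains only "abc" and "def". It should not match the third document, whose fieldName values are ("abc", "def", "gh").

Solution

You can add a must_not clause to exclude every document that has "entities.fieldName": "gh". Modify your search query as shown below: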

{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "entities",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "entities.fieldName": "abc"
                    }
                  }
                ]
              }
            }
          }
        },{
          "nested": {
            "path": "entities",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "entities.fieldName": "def"
                    }
                  }
                ]
              }
            }
          }
        }
      ],
      "must_not": {
        "nested": {
          "path": "entities",
          "query": {
            "bool": {
              "should": [
                {
                  "match": {
                    "entities.fieldName": "gh"
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}

The search results will be:

"hits": [
      {
        "_index": "66381155",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.8266786,
        "_source": {
          "entities": [
            {
              "fieldName": "abc"
            },{
              "fieldName": "def"
            }
          ]
        }
      },{
        "_index": "66381155",
        "_id": "2",
        "_source": {
          "entities": [
            {
              "fieldName": "abc"
            },{
              "fieldName": "def"
            }
          ]
        }
      }
    ]
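
The must / must_not structure above can also be generated from two lists of names. The following Python sketch is only an illustration: the helper names nested_match and build_query are hypothetical, and it drops the inner single-clause bool wrapper, which should be equivalent for one match clause. Note that the excluded values still have to be known in advance.

```python
import json


def nested_match(field_name):
    """Build one nested clause that matches a single entities.fieldName value."""
    return {
        "nested": {
            "path": "entities",
            "query": {"match": {"entities.fieldName": field_name}},
        }
    }


def build_query(required, excluded):
    """Require every name in `required` and reject every name in `excluded`."""
    return {
        "query": {
            "bool": {
                "must": [nested_match(name) for name in required],
                "must_not": [nested_match(name) for name in excluded],
            }
        }
    }


# Request body that could be sent to GET fetch_latest_version/_search
body = build_query(["abc", "def"], ["gh"])
print(json.dumps(body, indent=2))
```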

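Update 1:

AFAIK there is no way to tell Elasticsearch to ensure that a document contains only "fieldName": "abc" and "fieldName": "def". One approach is to check this condition before indexing the documents. You can create a top-level, non-nested field that holds this value.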

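I don't think there is an exact sub-match query, so I suggest creating a single field whose value is the combination of the names, e.g. "abcdef", and then running an exact search on that field.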
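
As a rough sketch of that idea, the combined value could be computed in the client before indexing. Everything named here is an assumption for illustration: the field name entityKey (which would need a keyword mapping), the "|" separator, and the choice to sort and de-duplicate the names so that the key does not depend on list order. A separator is used instead of plain concatenation because "ab" + "cdef" and "abc" + "def" would otherwise produce the same key.

```python
def combination_key(entities, sep="|"):
    """Join the distinct fieldName values in sorted order into one string."""
    names = sorted({entity["fieldName"] for entity in entities})
    return sep.join(names)


# Add the hypothetical top-level field before indexing the document.
doc = {"entities": [{"fieldName": "def"}, {"fieldName": "abc"}]}
doc["entityKey"] = combination_key(doc["entities"])

# Exact search on that field with a term query.
wanted = [{"fieldName": "abc"}, {"fieldName": "def"}]
search_body = {"query": {"term": {"entityKey": combination_key(wanted)}}}
```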

If you need to solve this without touching the data, you can use a script query:

GET /fetch_latest_version/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "script": {
              "script": {
                "lang": "painless",
                "source": "def x = doc['entities.fieldName.keyword']; x.containsAll(params.filter) && params.filter.length == x.length",
                "params": {
                  "filter": [
                    "def",
                    "abc"
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
}

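I had to remove the nested field type to get this working easily.

The script first checks that both "abc" and "def" are present in the document, then checks that the filter list and the document's values have the same length.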
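
The same two-step check can be modeled in plain Python to see why it rejects documents with extra values. This is only a sketch of the logic, not the Painless script itself; the function name matches_exactly is hypothetical.

```python
def matches_exactly(doc_values, wanted):
    """Mirror the script: all wanted values are present and the sizes are equal."""
    contains_all = all(value in doc_values for value in wanted)
    return contains_all and len(wanted) == len(doc_values)


wanted = ["def", "abc"]
exact = matches_exactly(["abc", "def"], wanted)        # same values, order ignored
extra = matches_exactly(["abc", "def", "gh"], wanted)  # one extra value
missing = matches_exactly(["abc"], wanted)             # one value missing
print(exact, extra, missing)
```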

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "fetch_latest_version",
        "_type" : "_doc",
        "_id" : "xf733XcBhQfuY9se1B5L",
        "_score" : 1.0,
        "_source" : {
          "entities" : [
            {
              "fieldName" : "abc"
            },{
              "fieldName" : "def"
            }
          ]
        }
      },{
        "_index" : "fetch_latest_version",
        "_id" : "xv743XcBhQfuY9seFR6t",
        "_source" : {
          "entities" : [
            {
              "fieldName" : "abc"
            },{
              "fieldName" : "def"
            }
          ]
        }
      }
    ]
  }
}
