Elasticsearch: matching documents when at least one object lacks a required field

Problem description

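I have documents in Elasticsearch, and I can't figure out how to apply a search script that returns a document if any of its attachments has no uuid or has a null uuid. I'm on Elasticsearch 5.2. The document mapping is: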

"mappings": {
    "documentType": {
        "properties": {
            "attachment": {
                "properties": {
                    "uuid": {
                        "type": "text"
                    },
                    "path": {
                        "type": "text"
                    },
                    "size": {
                        "type": "long"
                    }
                }
            }
        }
    }
}

In Elasticsearch the documents look like this:

{
    "_index": "documents",
    "_type": "documentType",
    "_id": "1",
    "_score": 1.0,
    "_source": {
        "attachment": [
            {
                "uuid": "21321321",
                "path": "../uploads/somepath",
                "size": 1231
            },
            {
                "path": "../uploads/somepath",
                ...
            }
        ]
    }
},
{
    "_index": "documents",
    "_id": "2",
    "_source": {
        "attachment": [
            {
                "uuid": "223645641321321",
                ...
            },
            {
                "uuid": "22341424321321",
                ...
            }
        ]
    }
},
{
    "_id": "3",
    "_source": {
        "attachment": [
            {
                "uuid": "22789789341321321",
                ...
            }
        ]
    }
}

So I want to get the documents with _id 1 and 3, but I get a script error instead. This is the query I tried:

{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },{
                    "script": {
                        "script": {
                            "inline": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}

This is the error:

"root_cause": [
    {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:77)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:36)",
            "for (item in doc['attachment'].value) { ",
            "                 ^---- HERE"
        ],
        "script": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
        "lang": "painless"
    }
],

Is it possible to select documents in which at least one attachment object has no uuid?

Solution

Iterating over an array of objects is not as trivial as one might expect. I've written about it at length here and here.

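Since your attachment field is not defined as nested, ES internally represents its sub-fields as flattened lists of values (also known as "doc values"). For example, attachment.uuid in document #2 becomes ["223645641321321","22341424321321"] and attachment.size becomes [1231,1231].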

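This means you can simply compare the .length of these flattened representations! I'm assuming attachment.size is always present, so it can serve as the baseline for the comparison.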
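
To make the length comparison concrete, here is a minimal Python sketch that imitates the flattening outside of Elasticsearch. It is only a model and not how ES is implemented: the `flatten` and `has_attachment_without_uuid` helpers are hypothetical, and so are the sample documents (document #2 reuses the uuid and size values quoted above, the rest is made up for illustration). As far as I know, real doc values are also sorted, and values of text and keyword fields are deduplicated per document, which this sketch ignores.

```python
def flatten(doc, obj_field, sub_field):
    """Collect every non-null value of obj_field.sub_field into one flat list."""
    values = []
    for item in doc.get(obj_field, []):
        value = item.get(sub_field)
        if value is not None:  # a missing or null value leaves no entry behind
            values.append(value)
    return values


def has_attachment_without_uuid(doc):
    """Mirror the script idea: fewer uuid values than size values."""
    uuids = flatten(doc, "attachment", "uuid")
    sizes = flatten(doc, "attachment", "size")
    return len(uuids) < len(sizes)


# Hypothetical sample documents keyed by _id (assumption: size is always set).
docs = {
    "1": {"attachment": [
        {"uuid": "21321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ]},
    "2": {"attachment": [
        {"uuid": "223645641321321", "path": "../uploads/somepath", "size": 1231},
        {"uuid": "22341424321321", "path": "../uploads/somepath", "size": 1231},
    ]},
    "3": {"attachment": [
        {"uuid": "22789789341321321", "path": "../uploads/somepath", "size": 1231},
        {"uuid": None, "path": "../uploads/somepath", "size": 1231},
    ]},
}

matching_ids = [doc_id for doc_id, doc in docs.items()
                if has_attachment_without_uuid(doc)]
print(matching_ids)  # ['1', '3']
```

Note that the comparison only tells you that some attachment lacks a uuid, not which one, because the flattened lists no longer remember which object each value came from.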

One more thing: to use these optimized doc values on a text field, one small mapping change is required (fielddata must be enabled on uuid):

PUT documents/documentType/_mappings
{
  "properties": {
    "attachment": {
      "properties": {
        "uuid": {
          "type": "text",
          "fielddata": true     <---
        },
        "path": {
          "type": "text"
        },
        "size": {
          "type": "long"
        }
      }
    }
  }
}

Once that's done, reindex your documents. This can be done with this little Update by query trick:

POST documents/_update_by_query

Then you can query with the following script:

POST documents/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "attachment"
          }
        },{
          "script": {
            "script": {
              "inline": "def size_field_length = doc['attachment.size'].length; def uuid_field_length =  doc['attachment.uuid'].length; return uuid_field_length < size_field_length",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}

Just to add to this answer: if the mapping for the uuid field was created automatically, Elasticsearch defines it like this:

"uuid": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "ignore_above": 256
        }
    }
}

The script could then look like this:

POST documents/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },{
                    "script": {
                        "script": {
                            "inline": "doc['attachment.size'].length > doc['attachment.uuid.keyword'].length",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}
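
The two query variants in this thread differ only in which field the uuid values are read from. If you build the request in application code, that can be a parameter. Below is a minimal Python sketch; `build_missing_uuid_query` is a hypothetical helper and not part of any Elasticsearch client, and it assumes you send the resulting JSON to `documents/_search` with whatever HTTP client you already use.

```python
import json


def build_missing_uuid_query(uuid_field="attachment.uuid.keyword",
                             baseline_field="attachment.size"):
    """Build the search body; both field names are caller-supplied assumptions."""
    # Same comparison as the inline Painless script shown above.
    script = "doc['{}'].length > doc['{}'].length".format(baseline_field, uuid_field)
    return {
        "query": {
            "bool": {
                "must": [
                    {"exists": {"field": "attachment"}},
                    {"script": {"script": {"inline": script, "lang": "painless"}}},
                ]
            }
        }
    }


# Auto-generated mapping: read the keyword sub-field.
body = build_missing_uuid_query()
# Explicit text mapping with fielddata enabled: read the text field itself.
fielddata_body = build_missing_uuid_query(uuid_field="attachment.uuid")

print(json.dumps(body, indent=2))
```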