Counting and sorting by number of occurrences in an array

Problem description

I have a type named account with the following mapping:

        "country" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "followingClientIds" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          },
          "fielddata" : true
        },

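followingClientIds is an array of string IDs of the other accounts I follow.

I want to build a query that fetches every account from a given country and then sorts them by the number of accounts we both follow.

Here is the query I have so far: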


GET account/_search
{
  "size": 20,
  "query": {
    "bool": {
      "filter": {
        "term": {
          "country.keyword": "AT"
        }
      }
    }
  },
  "sort": [
    {
      "followingClientIds.keyword": {
        "order": "asc",
        "nested_filter": {
          "terms": {
            "followingClientIds.keyword": [
              "dFbEW23hVZ3w8jhH9LeCw3QG33UjuF5C"
            ]
          }
        }
      }
    }
  ]
}

For example, I have these 3 documents of type account:

{
    "username": "user2",
    "country": "AT",
    "followingClientIds": ["abc"]
},
{
    "username": "user3",
    "country": "AT",
    "followingClientIds": ["abc", "bcd", "cde"]
},
{
    "username": "user4",
    "country": "AT",
    "followingClientIds": ["abc"]
}

Suppose I send the following country and followingClientIds to the query for sorting:

{
    "country": "AT",
    "followingClientIds": ["abc", "bcd", "cde"]
}

I would like the result to look like this:

{
    "username": "user3",
    "followingClientIds": ["abc", "bcd", "cde"],
    "fields": [ // don't really need this custom field, but would be cool
        "mutual_following_count": 3
    ]
},
{
    "username": "user2",
    "followingClientIds": ["abc"],
    "fields": [
        "mutual_following_count": 1
    ]
},
{
    "username": "user4",
    "followingClientIds": ["abc"],
    "fields": [
        "mutual_following_count": 1
    ]
}
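To make the expected ordering concrete, here is a minimal Python sketch of the same ranking done in memory. It is not Elasticsearch code: the function name `rank_by_mutual_following` is made up for illustration, and it assumes all three sample accounts have country "AT".

```python
# Sample documents from the question. The country value "AT" for user3 and
# user4 is an assumption made so that all three pass the country filter.
accounts = [
    {"username": "user2", "country": "AT", "followingClientIds": ["abc"]},
    {"username": "user3", "country": "AT", "followingClientIds": ["abc", "bcd", "cde"]},
    {"username": "user4", "country": "AT", "followingClientIds": ["abc"]},
]


def rank_by_mutual_following(accounts, country, following_client_ids):
    """Filter by country, then sort by the number of shared followed IDs (descending)."""
    wanted = set(following_client_ids)
    ranked = []
    for account in accounts:
        if account.get("country") != country:
            continue
        # Set intersection counts each shared ID once, even if it is repeated.
        mutual = len(wanted & set(account["followingClientIds"]))
        ranked.append({**account, "mutual_following_count": mutual})
    # sorted() is stable, so accounts with equal counts keep their input order.
    return sorted(ranked, key=lambda a: a["mutual_following_count"], reverse=True)


ranked = rank_by_mutual_following(accounts, "AT", ["abc", "bcd", "cde"])
for account in ranked:
    print(account["username"], account["mutual_following_count"])
```

With the sample data this puts user3 first with a count of 3, followed by user2 and user4 with a count of 1 each.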

Solution

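If you are looking for a standalone computed field named mutual_following_count, that can be done with the script below, but you won't be able to sort on it.

The only other option is a script sort, which first computes a value and then sorts by it. The resulting query looks like this: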

{
  "size": 20,
  "query": {
    "bool": {
      "filter": {
        "term": {
          "country.keyword": "AT"
        }
      }
    }
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "order": "desc",
        "script": {
          "lang": "painless",
          "params": {
            "followingClientIds": ["abc", "bcd", "cde"]
          },
          "source": """
            // deduplicate both lists
            def fromSource = doc.followingClientIds
                                .stream()
                                .distinct()
                                .collect(Collectors.toList());
            def fromParams = params.followingClientIds
                                   .stream()
                                   .distinct()
                                   .collect(Collectors.toList());

            // count the requested IDs that the document also follows
            return (int) fromParams.findAll(x -> fromSource.contains(x)).size();
          """
        }
      }
    }
  ]
}
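In practice the country and the ID list would come from the caller rather than being hard-coded. The following is a rough Python sketch that builds the same request body as a plain dict. The helper name `build_mutual_sort_query` and the constant `MUTUAL_COUNT_SCRIPT` are made up for illustration, and sending the body to the account/_search endpoint is left to whatever HTTP or Elasticsearch client is in use.

```python
import json

# Painless source taken from the script sort above, collapsed onto fewer lines.
MUTUAL_COUNT_SCRIPT = """
def fromSource = doc.followingClientIds.stream().distinct().collect(Collectors.toList());
def fromParams = params.followingClientIds.stream().distinct().collect(Collectors.toList());
return (int) fromParams.findAll(x -> fromSource.contains(x)).size();
"""


def build_mutual_sort_query(country, following_client_ids, size=20):
    """Build the search body: filter by country, script-sort by mutual follow count."""
    return {
        "size": size,
        "query": {"bool": {"filter": {"term": {"country.keyword": country}}}},
        "sort": [
            {
                "_script": {
                    "type": "number",
                    "order": "desc",
                    "script": {
                        "lang": "painless",
                        # The IDs are passed as params, not spliced into the source.
                        "params": {"followingClientIds": list(following_client_ids)},
                        "source": MUTUAL_COUNT_SCRIPT,
                    },
                }
            }
        ],
    }


body = build_mutual_sort_query("AT", ["abc", "bcd", "cde"])
print(json.dumps(body, indent=2))
```

Passing the IDs through `params` keeps the script source identical between requests. Note that once serialized, the script source is an ordinary JSON string; the triple-quoted form shown in the query is console syntax.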

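The downside is that you cannot "name" this sort mutual_following_count or anything else; the computed value is only returned in each hit's sort array.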
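One way around the naming limitation is to copy the sort value into a named key on the client side. Below is a minimal Python sketch of that idea. The helper `attach_mutual_count` is hypothetical, and `sample_response` is a hand-written fragment shaped like a search response, not real output.

```python
def attach_mutual_count(response, field_name="mutual_following_count"):
    """Copy each hit's first sort value into its source document under field_name."""
    results = []
    for hit in response["hits"]["hits"]:
        doc = dict(hit["_source"])  # copy so the response itself is not modified
        # The script sort is the only sort clause, so its value is at index 0.
        doc[field_name] = int(hit["sort"][0])
        results.append(doc)
    return results


# Hand-written fragment for illustration only (not actual Elasticsearch output).
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"username": "user3"}, "sort": [3.0]},
            {"_source": {"username": "user2"}, "sort": [1.0]},
            {"_source": {"username": "user4"}, "sort": [1.0]},
        ]
    }
}

results = attach_mutual_count(sample_response)
for doc in results:
    print(doc)
```

This only relabels a value that Elasticsearch already computed for sorting, so the script does not have to run a second time.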

elasticsearch elasticsearch querydsl
