Elasticsearch query with should returns 10,000 results even though nothing matches

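Problem description

I have an index with about 60GB of data, and I want to run a query that retrieves one specific product by its reference number.

Here is my query: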

GET myindex/_search
{
  "_source": [
    "product.ref",
    "product.urls.*",
    "product.i18ns.*.title",
    "product_sale_elements.quantity",
    "product_sale_elements.prices.*.price",
    "product_sale_elements.listen_price.*",
    "product.images.image_url",
    "product.image_count",
    "product.images.visible",
    "product.images.position"
  ],
  "size": "6",
  "from": "0",
  "query": {
    "function_score": {
      "functions": [
        {
          "field_value_factor": {
            "field": "product.sales_count",
            "missing": 0,
            "modifier": "log1p"
          }
        },
        {
          "field_value_factor": {
            "field": "product.image_count"
          }
        },
        {
          "field_value_factor": {
            "field": "featureCount",
            "modifier": "log1p"
          }
        }
      ],
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "product.is_visible": true
              }
            }
          ],
          "should": [
            {
              "query_string": {
                "default_field": "product.ref",
                "query": "13141000",
                "boost": 2
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "by_categories": {
      "terms": {
        "field": "categories.i18ns.de_DE.title.raw",
        "size": 100
      }
    }
  }
}

So my question is: why does this query return 10k results when I only want the single product with that reference number?

If I do this:

GET my-index/_search
{
  "query": {
    "match": {
      "product.ref": "13141000"
    }
  }
}

it matches correctly. How does should differ from a plain match query?

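Solution

If a bool query has a must or filter clause, documents that satisfy the must or filter do not also have to match your should clause, because should is then considered "optional". In your query every visible product passes the filter, so all of them are returned, and the should clause only affects their score.

You can either move the query_string clause from should to filter, or set minimum_should_match to 1, like this: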

...
"should": [
  {
    "query_string": {
      "default_field": "product.ref",
      "query": "13141000",
      "boost": 2
    }
  }
],
"minimum_should_match": 1,
...
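
For comparison, the other option mentioned above (moving the query_string clause into filter) might look roughly like the sketch below. It shows only the inner bool query and reuses the field names from the question. The boost is left out on the assumption that scoring of the reference match is not needed, since filter clauses do not contribute to the score.

```json
{
  "bool": {
    "filter": [
      { "term": { "product.is_visible": true } },
      {
        "query_string": {
          "default_field": "product.ref",
          "query": "13141000"
        }
      }
    ]
  }
}
```

With this shape a document is returned only if it passes both filter clauses, so products with a different reference number are excluded.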

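must - The condition must match.

should - If the condition matches, it improves the score in a non-filter context (when minimum_should_match is not explicitly declared).

As you can see, must is similar to filter but also contributes to the score. filter does not provide any scoring.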

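You can also put this clause in a new must clause, which makes it required while still letting it contribute to the score.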

If you put the query_string clause in a filter clause instead, boost will not affect scoring.
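
As a minimal sketch of the must variant, assuming the rest of the request (function_score, _source, aggs) stays as in the question, the inner bool query could be written like this. Here the boost still takes effect because must clauses are scored.

```json
{
  "bool": {
    "filter": [
      { "term": { "product.is_visible": true } }
    ],
    "must": [
      {
        "query_string": {
          "default_field": "product.ref",
          "query": "13141000",
          "boost": 2
        }
      }
    ]
  }
}
```

This is only one way to arrange the clauses. If the score of the reference match does not matter, the filter form is usually the simpler choice.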

Read more about the bool query in the Elasticsearch documentation.
