Spring Data Elasticsearch does not return the expected results

Problem description

I am using Spring Data Elasticsearch to query my Elasticsearch documents. My Elasticsearch entity class:

//all the annotation things i.e lombok,de/serializer etc
@Document(indexName = "project",type = "project")
@EqualsAndHashCode
public class ProjectEntity extends CommonProperty implements Serializable {
    @Id
    private String id;
    private String projectName;
    private String description;
    private String parentProjectId;
    private Long projectOwner;
    private String projectOwnerName;
    private Long projectManager;
    private String projectManagerName;
    private String departmentId;
    private String status;
    private String organizationId;

    @Field(type = FieldType.Nested)
    private List<ActionStatusEntity> actionStatusList = new ArrayList<>();

    @Field(type = FieldType.Nested)
    private List<TeamMember> teamMemberList;

    @Field(type = FieldType.Nested)
    private List<UserDefineProperty> riskList;

}

I have omitted the rest, such as the repository setup, for brevity. Searching the data:

    projectRepository.findByOrganizationIdAndProjectName(userEntity.getorganizationId().toString(),projectRequest.getProjectName().trim());
//userEntity.getorganizationId().toString()="28",projectName="Team Test"

Spring generates the following query for the call above:

{
  "from": 0,"size": 10000,"query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "28","fields": [
              "organizationId^1.0"
            ],"type": "best_fields","default_operator": "and","max_determinized_states": 10000,"enable_position_increments": true,"fuzziness": "AUTO","fuzzy_prefix_length": 0,"fuzzy_max_expansions": 50,"phrase_slop": 0,"escape": false,"auto_generate_synonyms_phrase_query": true,"fuzzy_transpositions": true,"boost": 1
          }
        },{
          "query_string": {
            "query": "Team Test","fields": [
              "projectName^1.0"
            ],"boost": 1
          }
        }
      ],"adjust_pure_negative": true,"boost": 1
    }
  },"version": true
}

Query result:

{
  "took" : 8,"timed_out" : false,"_shards" : {
    "total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0
  },"hits" : {
    "total" : {
      "value" : 3,"relation" : "eq"
    },"max_score" : 4.1767306,"hits" : [
      {
        "_index" : "project","_type" : "project","_id" : "215","_version" : 2,"_score" : 4.1767306,"_source" : {
          "projectName" : "team member only test","description" : "team member only test","projectOwner" : 50,"projectOwnerName" : "***","departmentId" : "team member only test","organizationId" : "28"
        }
      },{
        "_index" : "project","_id" : "408","_version" : 17,"_source" : {
          "projectName" : "Category & Team adding test","description" : "Category & Team adding test","projectManager" : 50,"projectManagerName" : "***","departmentId" : "cat"
        }
      },{
        "_id" : "452","_version" : 4,"_score" : 3.4388955,"_source" : {
          "projectName" : "team member not in system test","description" : "id-452","projectOwner" : 53,"projectManager" : 202,"departmentId" : "abc","organizationId" : "28"
        }
      }
    ]
  }
}

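Look at the result set: the projectName field is matched as if by a "contains" check! It does not match the complete given parameter.
Why does this happen, and how can I fix it?
Addendum: the organizationId and projectName fields are set with fieldData=true.

Solution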

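The query that Spring Data Elasticsearch derives from the method name is an Elasticsearch query string query with the given parameters, as you have noticed. For such a query, Elasticsearch analyzes and parses the terms and then searches for documents that contain those terms.

Your query with "Team Test" has two terms, "Team" and "Test", and all the documents you show have these terms in their project names, so they are returned.

If you had a document containing "Team Test" with no other terms between the two, it would be returned with a higher score.

This implementation was chosen because it is what is normally expected when searching in Elasticsearch. Imagine an index of names in which a search for "Harry Miller" would not find the documents containing "Harry B. Miller".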
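
The difference between matching on analyzed terms and matching a phrase can be illustrated with a small plain-Java sketch. It does not call Elasticsearch; the `analyze` method is only a rough stand-in for an analyzer (lowercasing and splitting on non-alphanumeric characters), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Simplified simulation, not Elasticsearch code: shows why "Team Test"
// matches every project name that contains both terms.
final class AnalyzedMatchSketch {

    // Rough stand-in for an analyzer: lowercase, split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Term matching with an "and" operator: every query term must occur
    // somewhere in the field, in any position.
    static boolean matchesAllTerms(String fieldValue, String query) {
        return analyze(fieldValue).containsAll(analyze(query));
    }

    // Phrase matching: the query terms must occur adjacent and in order.
    static boolean matchesPhrase(String fieldValue, String query) {
        List<String> queryTerms = analyze(query);
        return !queryTerms.isEmpty()
                && Collections.indexOfSubList(analyze(fieldValue), queryTerms) >= 0;
    }

    public static void main(String[] args) {
        List<String> projectNames = Arrays.asList(
                "team member only test",
                "Category & Team adding test",
                "team member not in system test",
                "team test");
        for (String name : projectNames) {
            System.out.println(name
                    + " -> allTerms=" + matchesAllTerms(name, "Team Test")
                    + ", phrase=" + matchesPhrase(name, "Team Test"));
        }
    }
}
```

In this simulation all four names satisfy the term match, while only "team test" satisfies the phrase match.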

You can write a custom repository method that builds a query suited to your needs and use that. Or, if you always want an exact search on this field, you can define it as a keyword field to prevent it from being parsed and analyzed.

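You can use a match_phrase query in the repository method definition. A small code sample would use only one parameter; you need to add the organization ID as well, but the resulting query is too complex for a small sample.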
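
A minimal sketch of what such a method could look like, assuming Spring Data Elasticsearch's `@Query` annotation with `?0` placeholders. The repository method appears only in a comment so the block compiles without Spring; the method name `findByProjectNamePhrase` and the `bind` helper are hypothetical, and `bind` is a simplified stand-in for the framework's parameter substitution, not its actual implementation.

```java
// In a repository interface, the query string would go into @Query, e.g.:
//
//   @Query("{\"match_phrase\": {\"projectName\": \"?0\"}}")
//   List<ProjectEntity> findByProjectNamePhrase(String projectName); // hypothetical name
//
// The class below only shows, in plain Java, which JSON results once the
// placeholder is replaced by the method argument.
final class MatchPhraseQuerySketch {

    static final String MATCH_PHRASE_TEMPLATE =
            "{\"match_phrase\": {\"projectName\": \"?0\"}}";

    // Simplified placeholder substitution (assumption, not framework code).
    // Iterates from the highest index down so that "?1" never clobbers "?10".
    static String bind(String template, String... params) {
        String result = template;
        for (int i = params.length - 1; i >= 0; i--) {
            // Escape backslashes and double quotes so the JSON stays valid.
            String escaped = params[i].replace("\\", "\\\\").replace("\"", "\\\"");
            result = result.replace("?" + i, escaped);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(bind(MATCH_PHRASE_TEMPLATE, "Team Test"));
    }
}
```

To restrict the search to one organization as well, the match_phrase clause would be placed next to a clause on organizationId inside a bool/must query, with the organization ID passed as a second placeholder.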


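I am not familiar with Spring Data Elasticsearch, but here is a working example with index data, a search query, and the search result in JSON format.

Index data:

Index all three documents above (from the question), and insert a fourth document as shown below.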

{
    "projectName": "team test","description": "id-452","projectOwner": 53,"projectOwnerName": "***","projectManager": 202,"projectManagerName": "***","departmentId": "abc","organizationId": "28"
}

Search query:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "organizationId": 28
          }
        },{
          "multi_match": {
            "query": "Team Test","type": "phrase","fields": [
              "projectName"
            ]
          }
        }
      ]
    }
  }
}

Search result:

"hits": [
      {
        "_index": "stof_64151693","_type": "_doc","_id": "4","_score": 0.5003766,"_source": {
          "projectName": "team test","organizationId": "28"
        }
      }
    ]