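Problem description
I am having trouble understanding some unexpected results from an Elasticsearch query. The following document is indexed in Elasticsearch.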
{
"group": "J00-I99",
"codes": [
{ "id": "J15", "description": "hello world" },
{ "id": "J15.0", "description": "test one world" },
{ "id": "J15.1", "description": "test two world J15.0" },
{ "id": "J15.2", "description": "test two three world J15" },
{ "id": "J15.3", "description": "hello world J18 " },
............................ // Similar records here
{ "id": "J15.9", "description": "hello world new" },
{ "id": "J16.0", "description": "new description" }
]
}
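My goal here is to implement autocomplete, and for that I used the n-gram approach. I do not want to use the completion suggester.
Currently, I am facing two issues:
First issue. Expected result: all of the entries above that include J15. Actual result: only a few results (J15.0, J15.1, J15.8).
Second issue. Expected result:
{ "id": "J15.1",
Actual result:
{ "id": "J15.0",
Here is the complete mapping.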
{
settings: {
number_of_shards: 1,analysis: {
filter: {
ngram_filter: {
type: 'edge_ngram',min_gram: 2,max_gram: 20
}
},analyzer: {
ngram_analyzer: {
type: 'custom',tokenizer: 'standard',filter: [
'lowercase','ngram_filter'
]
}
}
}
},mappings: {
properties: {
group: {
type: 'text'
},codes: {
type: 'nested',properties: {
id: {
type: 'text',analyzer: 'ngram_analyzer',search_analyzer: 'standard'
},description: {
type: 'text',search_analyzer: 'standard'
}
}
}
}
}
}
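To make the mapping above more concrete, here is a minimal Python sketch of what ngram_analyzer might emit for a code at index time. It is an approximation, not Elasticsearch itself: the helper names `edge_ngrams` and `index_tokens` are hypothetical, and plain whitespace splitting stands in for the standard tokenizer, on the assumption that a code such as J15.1 is kept as a single token.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of `token` with length in [min_gram, max_gram]."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_tokens(text, min_gram=2, max_gram=20):
    """Approximate ngram_analyzer: split, lowercase, edge n-gram each token.

    Assumption: whitespace splitting stands in for the standard tokenizer.
    """
    tokens = []
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word, min_gram, max_gram))
    return tokens


if __name__ == "__main__":
    for code in ["J15", "J15.1", "J16.0"]:
        print(code, index_tokens(code))
```

Under these assumptions, every id that starts with J15 is indexed with the prefix token j15, which is the token a standard-analyzed search for J15 would look up, while J16.0 is not.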
GET myindex/_search
{
"_source": {
"excludes": [
"codes"
]
},"query": {
"nested": {
"path": "codes","query": {
"bool": {
"should": [
{
"match": {
"codes.description": "J15"
}
},{
"match": {
"codes.id": "J15"
}
}
]
}
},"inner_hits": {}
}
}
}
Note: the indexed document will be large. Only sample data is shown here.
For the second issue, can I use multi_match with the AND operator, as shown below?
GET myindex/_search
{
"_source": {
"excludes": [
"codes"
]
},"query": {
"nested": {
"path": "codes","query": {
"bool": {
"should": [
{
"multi_match": {
"query": "J15","fields": ["codes.id","codes.description"],"operator": "and"
}
}
]
}
},"inner_hits": {}
}
}
}
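I have been struggling to solve this, so any help would be greatly appreciated.
Solution
The problem is that, by default, inner_hits returns only 3 matching documents, as mentioned in the official documentation:
size
The maximum number of hits to return per inner_hits. By default, the top three matching hits are returned.
Set the size parameter in your inner_hits to a value large enough to return all of the matching codes.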
"inner_hits": {
"size": 10 // note this
}
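As a sketch of how the size parameter fits into the full request, the following Python builds the body of the first query with a configurable inner_hits size. The function name `nested_codes_query` is hypothetical, and the body is only printed here; sending it with a client is left out.

```python
import json


def nested_codes_query(term, inner_hits_size=10):
    """Build the nested search body from the question, with inner_hits size set."""
    return {
        "_source": {"excludes": ["codes"]},
        "query": {
            "nested": {
                "path": "codes",
                "query": {
                    "bool": {
                        "should": [
                            {"match": {"codes.description": term}},
                            {"match": {"codes.id": term}},
                        ]
                    }
                },
                # Without "size", only the top three inner hits come back.
                "inner_hits": {"size": inner_hits_size},
            }
        },
    }


body = nested_codes_query("J15", inner_hits_size=10)
print(json.dumps(body, indent=2))
```

For large documents, note that the inner hits window is bounded by the index setting index.max_inner_result_window, which defaults to 100, so a larger size may require raising that setting.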
I tried this on your sample data. Below are the search results for your first query, which was previously returning only 3 hits.
Search results for the first query
"hits": [
{
"_index": "myindexedge64170045","_type": "_doc","_id": "1","_nested": {
"field": "codes","offset": 2
},"_score": 1.8687118,"_source": {
"id": "J15.1","description": "test two world J15.0"
}
},{
"_index": "myindexedge64170045","_nested": {
"field": "codes","offset": 3
},"_score": 1.7934312,"_source": {
"id": "J15.2","description": "test two three world J15"
}
},{
"_nested": {
"field": "codes","offset": 0
},"_score": 0.29618382,"_source": {
"id": "J15","description": "hello world"
}
},{
"_nested": {
"field": "codes","offset": 1
},"_source": {
"id": "J15.0","description": "test one world"
}
},{
"_nested": {
"field": "codes","offset": 4
},"_source": {
"id": "J15.3","description": "hello world J18 "
}
},{
"_nested": {
"field": "codes","offset": 5
},"_source": {
"id": "J15.9","description": "hello world new"
}
}
]
}
}
}
}
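Adding another answer, because this is a separate issue and the first answer focused on the first one.
The problem is that your second query, test two, returns test one world. At index time you are using ngram_analyzer, which is built on the standard tokenizer and so splits the text on whitespace, and your search analyzer is again standard. If you run the Analyze API on the indexed document and on the search term, you will see that they share a token: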
{
"text" : "test one world","analyzer" : "standard"
}
which generates the tokens
{
"tokens": [
{
"token": "test","start_offset": 0,"end_offset": 4,"type": "<ALPHANUM>","position": 0
},{
"token": "one","start_offset": 5,"end_offset": 8,"type": "<ALPHANUM>","position": 1
},{
"token": "world","start_offset": 9,"end_offset": 14,"type": "<ALPHANUM>","position": 2
}
]
}
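And for your search term test two:
{
"tokens": [
{
"token": "test","start_offset": 0,"end_offset": 4,"type": "<ALPHANUM>","position": 0
},{
"token": "two","start_offset": 5,"end_offset": 8,"type": "<ALPHANUM>","position": 1
}
]
}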
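As you can see, the test token is present in the document, so you get that search result. A match query combines its terms with OR by default, so one shared token is enough. This can be fixed by using the AND operator in the query, as shown below.
Search query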
{
"_source": {
"excludes": [
"codes"
]
},"query": {
"nested": {
"path": "codes","query": {
"bool": {
"must": {
"multi_match": {
"query": "test two","fields": [
"codes.id","codes.description"
],"operator" :"AND"
}
}
}
},"inner_hits": {}
}
}
}
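And the search results
"hits": [
{
"_index": "myindexedge64170045","_score": 2.6901608,"_source": {
"id": "J15.1","description": "test two world J15.0"
}
},{
"_index": "myindexedge64170045","_score": 2.561376,"_source": {
"id": "J15.2","description": "test two three world J15"
}
}
]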
}
}
}
}
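The difference between the two operators can be sketched in a few lines of Python. This is a simplified model of term matching on a single field, not Elasticsearch scoring: `tokens` and `matches` are hypothetical helpers, and whitespace splitting plus lowercasing stands in for the standard analyzer.

```python
def tokens(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def matches(query, field_text, operator="or"):
    """Return True if the field matches the query under the given operator.

    "or":  at least one query token occurs in the field (the match default).
    "and": every query token occurs in the field.
    """
    field_tokens = set(tokens(field_text))
    found = [t in field_tokens for t in tokens(query)]
    return all(found) if operator == "and" else any(found)


if __name__ == "__main__":
    descriptions = ["test one world", "test two world J15.0", "test two three world J15"]
    for text in descriptions:
        print(text, "| or:", matches("test two", text), "| and:", matches("test two", text, "and"))
```

With OR, test one world matches because it shares the token test; with AND it drops out, and only the two descriptions containing both test and two remain.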
Adding a working example with the index mapping, search query, and search results.
Index mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram","min_gram": 2,"max_gram": 20,"token_chars": [
"letter","digit"
]
}
}
},"max_ngram_diff": 50
},"mappings": {
"properties": {
"group": {
"type": "text"
},"codes": {
"type": "nested","properties": {
"id": {
"type": "text","analyzer": "my_analyzer"
}
}
}
}
}
}
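The tokenizer in the mapping above behaves differently from a token filter applied after the standard tokenizer, because token_chars makes it split on every character that is not a letter or digit. A rough Python approximation follows; `edge_ngram_tokenizer` is a hypothetical helper, and the regular expression is an assumed stand-in for the letter and digit character classes.

```python
import re


def edge_ngram_tokenizer(text, min_gram=2, max_gram=20):
    """Approximate an edge_ngram tokenizer with token_chars [letter, digit].

    The text is split on anything that is not a letter or digit, then each
    chunk yields its prefixes of length min_gram..max_gram. Chunks shorter
    than min_gram yield nothing. Case is preserved (no lowercase filter).
    """
    tokens = []
    for chunk in re.findall(r"[^\W_]+", text):
        upper = min(len(chunk), max_gram)
        tokens.extend(chunk[:n] for n in range(min_gram, upper + 1))
    return tokens


if __name__ == "__main__":
    for code in ["J15", "J15.1", "J16.0"]:
        print(code, edge_ngram_tokenizer(code))
```

In this model the part after the dot is dropped, since a single digit is shorter than min_gram. Also, because this mapping sets no separate search_analyzer, the query text would presumably go through the same tokenizer, so a search for J15 would also produce J1, a token that J16.0 shares.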
Index data:
{
"group": "J00-I99","codes": [
{
"id": "J15","description": "hello world"
},{
"id": "J15.0","description": "test one world"
},{
"id": "J15.1","description": "test two world J15.0"
},{
"id": "J15.2","description": "test two three world J15"
},{
"id": "J15.3","description": "hello world J18 "
},{
"id": "J15.9","description": "hello world new"
},{
"id": "J16.0","description": "new description"
}
]
}
Search query:
{
"_source": {
"excludes": [
"codes"
]
},"query": {
"nested": {
"path": "codes","query": {
"bool": {
"should": [
{
"match": {
"codes.description": "J15"
}
},{
"match": {
"codes.id": "J15"
}
}
],"must": {
"multi_match": {
"query": "test two","type": "phrase"
}
}
}
},"inner_hits": {}
}
}
}
Search results:
"inner_hits": {
"codes": {
"hits": {
"total": {
"value": 2,"relation": "eq"
},"max_score": 3.2227304,"hits": [
{
"_index": "stof_64170045","_nested": {
"field": "codes","offset": 3
},"_score": 3.2227304,"_source": {
"id": "J15.2","description": "test two three world J15"
}
},{
"_index": "stof_64170045","_nested": {
"field": "codes","offset": 2
},"_score": 2.0622847,"_source": {
"id": "J15.1","description": "test two world J15.0"
}
}
]
}
}
}
}
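For autocomplete, the application usually needs only the matched codes rather than the whole response. The sketch below pulls the ids out of a response shaped like the one above. `inner_hit_ids` is a hypothetical helper, and `sample_response` is a trimmed, hand-written dictionary based on the results shown, not real client output.

```python
def inner_hit_ids(response, name="codes"):
    """Collect the `id` of every nested inner hit, in the order returned."""
    ids = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(name, {})
        for nested in inner.get("hits", {}).get("hits", []):
            ids.append(nested["_source"]["id"])
    return ids


# Trimmed sample with the same shape as the search results above.
sample_response = {
    "hits": {
        "hits": [
            {
                "inner_hits": {
                    "codes": {
                        "hits": {
                            "hits": [
                                {"_source": {"id": "J15.2", "description": "test two three world J15"}},
                                {"_source": {"id": "J15.1", "description": "test two world J15.0"}},
                            ]
                        }
                    }
                }
            }
        ]
    }
}

print(inner_hit_ids(sample_response))
```

The chained `.get` calls are a design choice so that a response with no matching parent documents returns an empty list instead of raising a KeyError.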