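Problem description
I have added 15,000 records to the Elasticsearch index products_idx1 with the type product.
The product names in these records look like apple iphone 6, so when I search for iphone6 the query returns no data.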
<?php
use Elasticsearch\ClientBuilder;
require 'vendor/autoload.php';

$client = ClientBuilder::create()->build();
// Fields searched by the multi_match clause.
$values = ['name', 'name.prefix', 'name.suffix', 'sku'];
$params = [
    'index'  => 'products_idx1',
    'client' => ['verify' => 1, 'connect_timeout' => 5],
    'from'   => 0,
    'size'   => 25,
    'body'   => [
        'query' => [
            'bool' => [
                'should' => [
                    ['multi_match' => [
                        'query'    => 'iphone6',
                        'type'     => 'cross_fields',
                        'fields'   => $values,
                        'operator' => 'OR',
                    ]],
                    ['match' => ['all' => [
                        'query'     => 'iphone6',
                        'operator'  => 'OR',
                        'fuzziness' => 'AUTO',
                    ]]],
                ],
            ],
        ],
        'sort' => ['_score' => ['order' => 'desc']],
    ],
];
$response = $client->search($params);
echo "<pre>";
print_r($response);
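Solution
Using the shingle and pattern_replace token filters, it is possible to get results for the search terms mentioned in the question and the comments (iphone, iphone6 and appleiphone). A complete example follows.
As mentioned in the comments, the search-time tokens generated from your search term must match the index-time tokens generated from the indexed documents for a search to return results. That is what I achieved by creating a custom analyzer.
Index mapping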
{
  "settings": {
    "analysis": {
      "analyzer": {
        "text_analyzer": {
          "tokenizer": "standard",
          "filter": ["shingle", "lowercase", "space_filter"]
        }
      },
      "filter": {
        "space_filter": {
          "type": "pattern_replace",
          "pattern": " ",
          "replacement": "",
          "preserve_original": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "text_analyzer"
      }
    }
  }
}
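To make the effect of this analyzer more concrete, here is a rough Python simulation of the filter chain. It is not Elasticsearch itself: it assumes the shingle filter's default settings (two-word shingles, unigrams kept, a single space as separator) and approximates the standard tokenizer with a regular expression. The function names are made up for illustration.

```python
import re


def standard_tokenize(text):
    # Crude stand-in for the standard tokenizer: runs of word characters.
    return re.findall(r"\w+", text)


def shingle(tokens, size=2, separator=" "):
    # Emit every single token, followed by the two-word shingle that
    # starts at the same position (when one exists).
    out = []
    for i, token in enumerate(tokens):
        out.append(token)
        if i + size <= len(tokens):
            out.append(separator.join(tokens[i:i + size]))
    return out


def text_analyzer(text):
    # Same order as the filter list in the mapping:
    # shingle, then lowercase, then remove spaces (space_filter).
    tokens = shingle(standard_tokenize(text))
    tokens = [t.lower() for t in tokens]
    return [t.replace(" ", "") for t in tokens]


print(text_analyzer("apple iphone 6"))
# ['apple', 'appleiphone', 'iphone', 'iphone6', '6']
print(text_analyzer("iphone6"))
# ['iphone6']
```

The shingle step joins neighbouring words with a space and the space-removal step glues them together, which is where the tokens appleiphone and iphone6 come from.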
Index a sample document
{
  "title": "apple iphone 6"
}
Search query for appleiphone
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "appleiphone"
          }
        }
      ]
    }
  }
}
Result
"hits": [
  {
    "_index": "ana",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.3439677,
    "_source": {
      "title": "apple iphone 6",
      "title_normal": "apple iphone 6"
    }
  }
]
Search query for iphone6
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "iphone6"
          }
        }
      ]
    }
  }
}
Result
"hits": [
  {
    "_index": "ana",
    ...
    "_source": {
      ...
      "title_normal": "apple iphone 6"
    }
  }
]
Last but not least, the search query for iphone
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "iphone"
          }
        }
      ]
    }
  }
}
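Since my previous answer is already quite long, I am adding the explanation of how it works in a separate answer, both for readability and for readers who are less familiar with analyzers and the analyze API in Elasticsearch.
In the comments on my previous answer, @Niraj mentioned that other documents were working but that he was still having trouble with the iphone6 query. The analyze API is very useful for debugging this kind of problem.
First, check the index-time tokens of the document you expect to match your search query, in this case apple iphone 6:
POST http://{{hostname}}:{{port}}/{{index}}/_analyze
{
  "text": "apple iphone 6",
  "analyzer": "text_analyzer"
}
This generates the following tokens: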
{
  "tokens": [
    { "token": "apple", "start_offset": 0, "end_offset": 5, "type": "<ALPHANUM>", "position": 0 },
    { "token": "appleiphone", "start_offset": 0, "end_offset": 12, "type": "shingle", "position": 0, "positionLength": 2 },
    { "token": "iphone", "start_offset": 6, "end_offset": 12, "type": "<ALPHANUM>", "position": 1 },
    { "token": "iphone6", "start_offset": 6, "end_offset": 14, "type": "shingle", "position": 1, "positionLength": 2 },  // note this carefully
    { "token": "6", "start_offset": 13, "end_offset": 14, "type": "<NUM>", "position": 2 }
  ]
}
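As you can see, the analyzer we used also creates iphone6 as a token. Now check the search-time tokens by running the same _analyze request with the search term:
{
  "text": "iphone6",
  "analyzer": "text_analyzer"
}
This produces the following tokens: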
{
  "tokens": [
    { "token": "iphone6", "start_offset": 0, "end_offset": 7, "type": "<ALPHANUM>", "position": 0 }
  ]
}
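The search term also produces iphone6 as a token, and that token is present among the index-time tokens. That is why the document matches the search query shown in the complete example in my first answer.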
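The comparison described above can also be scripted when debugging several search terms. The sketch below is only an illustration: it works on dictionaries shaped like _analyze response bodies (trimmed to the token field), and the helper names are hypothetical. The last sample shows what a plain standard analyzer would emit for the same title, which is an assumption about the original mapping since the question does not show it.

```python
def tokens_of(analyze_response):
    """Pull the token strings out of an _analyze-style response body."""
    return [entry["token"] for entry in analyze_response["tokens"]]


def shared_tokens(index_response, search_response):
    """Return the search-time tokens that also exist at index time."""
    indexed = set(tokens_of(index_response))
    return [t for t in tokens_of(search_response) if t in indexed]


# Tokens from the two _analyze calls above, trimmed to the "token" field.
custom_index = {"tokens": [{"token": t} for t in
                           ["apple", "appleiphone", "iphone", "iphone6", "6"]]}
search = {"tokens": [{"token": "iphone6"}]}

# Assumed output of a plain standard analyzer for "apple iphone 6".
standard_index = {"tokens": [{"token": t} for t in ["apple", "iphone", "6"]]}

print(shared_tokens(custom_index, search))    # ['iphone6']
print(shared_tokens(standard_index, search))  # []
```

An empty list means the query has no token in common with the document, which is the situation described in the question; a non-empty list means a match query on that field can find the document.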