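Problem description
I need to prevent fields whose values are "null" (the string null) or "" (an empty string) from being indexed in Elasticsearch; that is, I should be able to retrieve documents whose _source does not contain these fields.
Is there a setting I need in the mapping at index time, such as a custom analyzer on the field?
P.S. I am using Elasticsearch 7.6.1.
I tried the following, based on an answer I found, but it does not work: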
{
  "settings": {
    "number_of_shards": "5",
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "char_filter": [
            {
              "type": "mapping",
              "mappings": [
                "null =>",
                "\"\"\" =>"
              ]
            }
          ],
          "filter": [
            "uppercase"
          ],
          "type": "custom"
        }
      }
    },
    "number_of_replicas": "1"
  }
}
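Error response: only value lists are allowed in serialized settings
I also tried the following settings, but still did not get the expected result: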
{
  "settings": {
    "number_of_shards": "5",
    "analysis": {
      "char_filter": {
        "my_filter": {
          "type": "mapping",
          "mappings": [
            "null =>",
            "\"\"\" =>"
          ]
        }
      },
      "normalizer": {
        "my_normalizer": {
          "char_filter": [
            "my_filter"
          ]
        }
      }
    },
    "number_of_replicas": "1"
  }
}
Request: GET index_name/_analyze
{"normalizer": "my_normalizer", "text": "null"}
Response:
{
  "tokens": [
    {
      "token": "",
      "start_offset": 4,
      "end_offset": 4,
      "type": "word",
      "position": 0
    }
  ]
}
Expected response:
{
"tokens": []
}
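The gap between the actual and expected responses can be illustrated with a simplified model. A normalizer treats the whole input as a single token, so it emits one token even when the char filter has removed every character, whereas an analyzer with a tokenizer emits nothing when no word characters remain. The Python sketch below is only a rough stand-in for Elasticsearch's behavior: the function names are hypothetical, the regex is a crude substitute for a real tokenizer, and the mappings are copied from the settings above.

```python
import re

# Mappings copied from the question: each key is replaced by an empty string.
MAPPINGS = {"null": "", '"""': ""}


def apply_mapping_char_filter(text, mappings=MAPPINGS):
    """Rough stand-in for a mapping char filter: replace each key with its value."""
    # Longer keys first, loosely mimicking longest-match replacement.
    for key in sorted(mappings, key=len, reverse=True):
        text = text.replace(key, mappings[key])
    return text


def normalizer_tokens(text):
    """Model of a normalizer: the whole input becomes exactly one token."""
    filtered = apply_mapping_char_filter(text)
    # "uppercase" is the token filter used in the first attempt above.
    return [filtered.upper()]


def analyzer_tokens(text):
    """Model of an analyzer: a tokenizer splits the filtered text into words."""
    filtered = apply_mapping_char_filter(text)
    # Assumption: runs of word characters approximate tokenizer output.
    return re.findall(r"\w+", filtered)


print(normalizer_tokens("null"))  # [''] : one empty token, as in the response above
print(analyzer_tokens("null"))    # []   : no tokens at all
```

Under this model, the empty token in the response is a consequence of using a normalizer rather than of a mistake in the mapping rules.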
Solution
You can do this with a mapping char filter in the analyzer definition; a working example follows.
Analyze API request:
{
  "tokenizer": "standard",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": [
        "null =>",
        "\"\"\" =>"
      ]
    }
  ],
  "text": "null"
}
Note that "text" can be either "null" or ""; the result is the same in both cases.
It returns an empty token list:
{
"tokens": []
}
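To use the same char filter at index time rather than only in an ad hoc Analyze API call, it would presumably need to be registered under a name in the index settings and referenced from a custom analyzer. The Python sketch below builds such a settings body. The names strip_null_filter and my_analyzer are hypothetical, and the mappings are copied verbatim from the request above; treat it as an illustration under those assumptions, not a verified configuration.

```python
import json


def build_index_settings(filter_name="strip_null_filter", analyzer_name="my_analyzer"):
    """Build index settings with a named mapping char filter and a custom analyzer.

    Both default names are hypothetical placeholders.
    """
    return {
        "settings": {
            "analysis": {
                "char_filter": {
                    filter_name: {
                        "type": "mapping",
                        # Mappings copied verbatim from the Analyze API request.
                        "mappings": ["null =>", '""" =>'],
                    }
                },
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Reference the char filter by name, not inline.
                        "char_filter": [filter_name],
                    }
                },
            }
        }
    }


# Print the body that could be sent when creating the index.
print(json.dumps(build_index_settings(), indent=2))
```

Keep in mind that analysis only determines which terms end up in the inverted index. The _source field stores the original JSON document as submitted, so a field whose value is "null" or "" would still appear in _source with this approach.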