Parsing JSON from CouchDB into Elasticsearch via Logstash

Problem

I can't get the JSON from CouchDB parsed into an Elasticsearch index the way I need. My CouchDB data looks like this:

{
  "_id": "56161609157031561692637",
  "_rev": "4-4119e8df293a6354be4c9fd7e8b12e68",
  "deleteFlag": "N",
  "entryUser": "John",
  "parameter": "{\"id\":\"14188\",\"rcs_p\":null,\"rcs_e\":null,\"dep_p\":null,\"dep_e\":null,\"dep_place\":null,\"rcf_p\":null,\"rcf_e\":null,\"rcf_place\":null,\"dlv_p\":\"3810\",\"dlv_e\":\"1569\",\"seg_no\":null,\"trans_type\":\"incoming\",\"trans_service\":\"delivery\"}",
  "physicalId": "0",
  "recordDate": "2020-12-28T17:50:16+05:45",
  "tag": "CARGO",
  "uId": "56161609157031561692637",
  "~version": "CgMBKgA="
}

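What I want is to be able to search on the nested fields inside the parameter JSON above. Note that parameter is a string containing escaped JSON, not an object. When I push the data into the ES index, it is stored like this: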

{
  "_index": "del3",
  "_type": "_doc",
  "_id": "XRCV9XYBx5PRwauO--qO",
  "_version": 1,
  "_score": 0,
  "_source": {
    "@version": "1",
    "doc_as_upsert": true,
    "doc": {
      "physicalId": "0",
      "recordDate": "2020-12-27T12:56:45+05:45",
      "~version": "CgMBGgA=",
      "uId": "48541609052212485430933",
      "_rev": "3-937bf92e6010afec13664b1d9d06844b",
      "parameter": "{\"id\":\"4038\",\"dlv_p\":\"5070\",\"dlv_e\":\"2015\",\"trans_service\":\"delivery\"}"
    },
    "@timestamp": "2021-01-12T07:53:33.978Z"
  },
  "fields": {
    "@timestamp": [
      "2021-01-12T07:53:33.978Z"
    ],
    "doc.recordDate": [
      "2020-12-27T07:11:45.000Z"
    ]
  }
}

I want to be able to access the fields inside parameter (id, rcs_p, rcs_e, ...) in Elasticsearch.

Here is my logstash.conf file:

input {
    couchdb_changes {
        host => "<host_name>"
        port => 5984
        db => "mychannel_asset$management"
        keep_id => false
        keep_revision => true
        #initial_sequence => 0
        always_reconnect => true
        sequence_path => "/usr/share/logstash/config/seqfile"
    }
}

filter {
        json {
                source => "[parameter]"
                remove_field => ["[parameter]"]
        }
}

output {
    if([doc][tag] == "CARGO") {
        elasticsearch {
            hosts => ["http://elasticsearch:9200"]
            index => "del3"
            user => elastic
            password => changeme
        }
    }
}

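How can I get the result I want? I also tried creating a custom template that defines a nested type for parameter, but no luck so far. Any help would be appreciated.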

Solution

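I think you've done almost everything right. I'm not sure about the actual event structure, but one of these two variants might work. Setting target to the same field as source replaces the JSON string with the parsed object in place: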

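# Variant 1: parameter sits at the top level of the event
filter {
    json {
        source => "parameter"
        target => "parameter"
    }
}

# Variant 2: parameter is nested under the doc object
filter {
    json {
        source => "[doc][parameter]"
        target => "[doc][parameter]"
    }
}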

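I don't know how the CouchDB input plugin works, but judging by the document that ended up in your index, it seems to put everything under the doc object. If so, a filter whose source is the top-level [parameter] field never finds anything to parse, and the second variant is the more likely fit.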
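
To make the difference between the two field paths concrete, here is a small Python sketch of the same idea. It is not how Logstash is implemented. The event dictionary is a hypothetical stand-in shaped like the _source shown in the question, and parse_json_field is an illustrative helper that only imitates what a json filter with identical source and target is expected to do.

```python
import json

# Hypothetical event, shaped like the "_source" in the question:
# the document fields appear nested under "doc".
event = {
    "doc": {
        "tag": "CARGO",
        "uId": "48541609052212485430933",
        "parameter": '{"id":"4038","dlv_p":"5070","dlv_e":"2015","trans_service":"delivery"}',
    }
}


def parse_json_field(event, path):
    """Replace the JSON string at `path` with its parsed value, in place.

    Returns True on success. Returns False and leaves the event untouched
    if the path does not exist, the value is not a string, or the string
    is not valid JSON.
    """
    *parents, leaf = path
    node = event
    for key in parents:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    if not isinstance(node, dict) or not isinstance(node.get(leaf), str):
        return False
    try:
        node[leaf] = json.loads(node[leaf])
    except json.JSONDecodeError:
        return False
    return True


# A top-level "parameter" does not exist on this event, so nothing is parsed.
print(parse_json_field(event, ["parameter"]))         # False
# The nested path matches, and the string becomes an object.
print(parse_json_field(event, ["doc", "parameter"]))  # True
print(event["doc"]["parameter"]["dlv_p"])             # 5070
```

Once parameter is stored as an object rather than a string, its keys should become searchable in Elasticsearch under dotted paths such as doc.parameter.dlv_p, assuming the index mapping does not already pin doc.parameter to a string type from earlier documents.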