How to split AWS CloudTrail JSON in FluentD

Problem description

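I have the following JSON, and I need to split the array called Records into separate messages in FluentD. I need this because all elements of the array are ingested into a single document in ES, so I can only see the first element in Kibana.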

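"The JSON is a single hash whose key 'Records' points to a hash whose single key 'message' contains an array in which each element represents one API event. FluentD is now picking up these files but reports only one event per file."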

What I have tried so far has not worked.

Does anyone know how to solve this?

{
  "_index": "abcd","_type": "abcd","_id": "abcd","_version": 1,"_score": null,"_source": {
    "Records": [
      {
        "eventVersion": "1.05","userIdentity": {
          "type": "abcd","principalId": "abcd","accountId": "1234"
        },"eventTime": "2020-11-11T09:18:34Z","eventSource": "abcd","eventName": "abcd","awsRegion": "us-east-1","sourceIPAddress": "x.x.x.x","userAgent": "abcd","requestParameters": {
          "roleArn": "abcd","roleSessionName": "abcd","externalId": "1234"
        },"responseElements": {
          "credentials": {
            "accessKeyId": "","expiration": "Nov 11,2020 10:18:34 AM","sessionToken": ""
          },"assumedRoleUser": {
            "assumedRoleId": "abcd","arn": "abcd"
          }
        },"requestID": "0f34e4e7-0869-44ec-8185-189aa074ff23","eventID": "d205f07f-1f30-4ba1-b99f-3fb929cdb9b7","resources": [
          {
            "accountId": "123","type": "abcd","ARN": "abcd"
          }
        ],"eventType": "AwsApiCall","recipientAccountId": "1234","sharedEventID": "d10ccd8d-0489-4e56-9453-e3e3b00915d3"
      },{
        "eventVersion": "1.05","arn": "abcd","accountId": "1234","accessKeyId": "","sessionContext": {
            "sessionIssuer": {
              "type": "abcd","principalId": "","userName": "abcd"
            },"webIdFederationData": {},"attributes": {
              "mfaAuthenticated": "false","creationDate": "2020-11-11T08:35:54Z"
            }
          }
        },"eventTime": "2020-11-11T09:18:17Z","errorCode": "abcd","errorMessage": "abcd","requestParameters": null,"responseElements": null,"requestID": "cf92658d-c91b-cac4-97b7-cb14cd3db39a","eventID": "dacac287-f47b-4299-94dd-d4b05b47325b","eventType": "abcd","recipientAccountId": "123"
      }
    ],"@timestamp": "2020-11-11T09:29:46.023523479+00:00","@log_name": "cloudtrail.logs"
  },"fields": {
    "@timestamp": [
      "2020-11-11T09:29:46.023Z"
    ]
  },"sort": [
    1234
  ]
}

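Solution

If you have the same problem, the answer is the fluent-plugin-record_splitter plugin. It emits each element of the array named by split_key as its own event, under the tag given by tag.

Example: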

<match raw.cloudtrail.logs*>
  @type record_splitter
  tag cloudtrail.logs
  split_key Records 
</match>
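
Because the split events are re-emitted under the tag cloudtrail.logs, a second match block is needed to pick them up and send them on to ES. The following is only a minimal sketch, assuming the fluent-plugin-elasticsearch output plugin is installed; the host and port are placeholders and do not come from the question:

# Assumption: fluent-plugin-elasticsearch is installed.
# host and port below are placeholders; replace them with your own ES endpoint.
<match cloudtrail.logs>
  @type elasticsearch
  host elasticsearch.example.internal
  port 9200
  # Write to daily logstash-style indices so Kibana can find each event
  logstash_format true
</match>

Note that the splitter's pattern raw.cloudtrail.logs* does not match the re-emitted tag cloudtrail.logs, so the split events are not fed back into the splitter.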