Logstash stopped processing because of an error: (SystemExit) exit

Problem description

We are trying to index Nginx access logs and error logs separately in Elasticsearch. To do that, we created the Filebeat and Logstash configurations below.

Here is our /etc/filebeat/filebeat.yml configuration:

filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/*access*.log
  exclude_files: ['\.gz$']
  # exclude_lines takes regular expressions, not glob patterns
  exclude_lines: ['ELB-HealthChecker']
  fields:
    log_type: type1
- type: log
  paths:
    - /var/log/nginx/*error*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['ELB-HealthChecker']
  fields:
    log_type: type2

output.logstash:
  hosts: ["10.227.XXX.XXX:5400"]

Our Logstash file, /etc/logstash/conf.d/logstash-nginx-es.conf, is configured as follows:

input {
    beats {
        port => 5400
    }
}

filter {
  if ([fields][log_type] == "type1") {
    grok {
      match => [ "message","%{NGINXACCESS}+%{GREEDYDATA:extra_fields}"]
      overwrite => [ "message" ]
    }
    mutate {
      convert => ["response","integer"]
      convert => ["bytes","integer"]
      convert => ["responsetime","float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "Nginx-geoip" ]
    }
    date {
      match => [ "timestamp","dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "user_agent"
    }
  } else {
      grok {
        match => [ "message", "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?"]
        overwrite => [ "message" ]
      }
      mutate {
        convert => ["response","integer"]
        convert => ["bytes","integer"]
        convert => ["responsetime","float"]
      }
      geoip {
        source => "clientip"
        target => "geoip"
        add_tag => [ "Nginx-geoip" ]
      }
      date {
        match => [ "timestamp","dd/MMM/YYYY:HH:mm:ss Z" ]
        remove_field => [ "timestamp" ]
      }
      useragent {
        source => "user_agent"
      }
    }
}

output {
  if ([fields][log_type] == "type1") {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      # Elasticsearch index names must be lowercase
      index => "nginx-access-logs-%{+YYYY.MM.dd}"
    }
  } else {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      index => "nginx-error-logs-%{+YYYY.MM.dd}"
    }
  }
  stdout {
    codec => rubydebug
  }
}

When we start Logstash, we get the following error messages:

[2020-10-12T06:05:39,183][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.9.2","jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.265-b01 on 1.8.0_265-b01 +indy +jit [linux-x86_64]"}
[2020-10-12T06:05:39,861][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-10-12T06:05:41,454][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main,:exception=>"LogStash::ConfigurationError",:message=>"Expected one of [ \\t\\r\\n],\"#\",\"{\",\",\"]\" at line 32,column 263 (byte 918) after filter {\n  if ([fields][log_type] == \"type1\") {\n    grok {\n      match => [ \"message\",\"%{NginxACCESS}+%{GREEDYDATA:extra_fields}\"]\n      overwrite => [ \"message\" ]\n    }\n    mutate {\n      convert => [\"response\",\"integer\"]\n      convert => [\"bytes\",\"integer\"]\n      convert => [\"responsetime\",\"float\"]\n    }\n    geoip {\n      source => \"clientip\"\n      target => \"geoip\"\n      add_tag => [ \"Nginx-geoip\" ]\n    }\n    date {\n      match => [ \"timestamp\",\"dd/MMM/YYYY:HH:mm:ss Z\" ]\n      remove_field => [ \"timestamp\" ]\n    }\n    useragent {\n      source => \"user_agent\"\n    }\n  } else {\n      grok {\n        match => [ \"message\",\"(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \\[%{LOGLEVEL:severity}\\] %{POSINT:pid}#%{NUMBER:threadid}\\: \\*%{NUMBER:connectionid} %{GREEDYDATA:message},request: \"",:backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'","org/logstash/execution/AbstractPipelineExt.java:183:in `initialize'","org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'","/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:44:in `initialize'","/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'","/usr/share/logstash/logstash-core/lib/logstash/agent.rb:357:in `block in converge_state'"]}
[2020-10-12T06:05:41,795][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-10-12T06:05:46,685][INFO ][logstash.runner          ] Logstash shut down.
[2020-10-12T06:05:46,706][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

It looks like a formatting problem somewhere in the configuration. Please help us resolve it.

================================= UPDATE =================================

For anyone looking for robust grok filters for Nginx access and error logs, try the following patterns.

Access_Logs - %{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:access_time}\] \"%{WORD:http_method} %{URIPATHPARAM:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{SPACE:referrer}\" \"%{DATA:agent}\" %{NUMBER:duration} req_header:\"%{DATA:req_header}\" req_body:\"%{DATA:req_body}\" resp_header:\"%{DATA:resp_header}\" resp_body:\"%{GREEDYDATA:resp_body}\"

Error_Logs - (?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{DATA:errormessage}, server: %{IP:server}, request: \"(?<httprequest>%{WORD:httpcommand} %{NOTSPACE:httpfile} HTTP/(?<httpversion>[0-9.]*))\", host: \"%{NOTSPACE:host}\"(, referrer: \"%{NOTSPACE:referrer}\")?
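
One detail to watch when wiring the error-log pattern into a pipeline: it captures timestamp in the Nginx error-log layout (year/month/day followed by the time, for example 2020/10/12 06:05:39), which the dd/MMM/YYYY:HH:mm:ss Z format used for access logs will not parse. A minimal sketch of a matching date filter, assuming the timestamp field name from the pattern above, the / separator, and the type2 label from the Filebeat configuration; it would go after the grok filter:

```
filter {
  if [fields][log_type] == "type2" {
    date {
      # Error-log timestamps look like yyyy/MM/dd HH:mm:ss and carry no time zone,
      # so the date filter falls back to the host's zone unless `timezone` is set.
      match => [ "timestamp", "yyyy/MM/dd HH:mm:ss" ]
      remove_field => [ "timestamp" ]
    }
  }
}
```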

Solution

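The grok pattern on line 32 of the Logstash configuration is the problem. The pattern is written as a double-quoted string, so the first unescaped " inside it (right after request: ) ends the string, and the parser then fails on whatever follows. Every " character inside the pattern must be escaped as \". Below is the escaped version of the grok filter.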

grok {
  # Every " inside the pattern is escaped as \" so it no longer terminates the string
  match => [ "message", "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\"(, upstream: \"%{GREEDYDATA:upstream}\")?, host: \"%{DATA:host}\"(, referrer: \"%{GREEDYDATA:referrer}\")?" ]
  overwrite => [ "message" ]
}
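
An alternative that avoids the backslashes, sketched here on the assumption that the pattern itself contains no single quotes: wrap the pattern in single quotes, so the double quotes inside it no longer end the string. The pattern below is deliberately shortened for illustration (the errormessage field name is borrowed from the update above) and is not a drop-in replacement for the full one.

```
filter {
  grok {
    # Single-quoted string: the " characters inside need no escaping.
    # Shortened, illustrative pattern; extend it with the remaining fields as needed.
    match => { "message" => '\[%{LOGLEVEL:severity}\] %{GREEDYDATA:errormessage}, request: "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}"' }
  }
}
```

Either form can be checked before restarting the service with Logstash's --config.test_and_exit (-t) option, which parses the configuration and reports errors like the one above without starting the pipeline.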
