Elasticsearch集群40亿级优化

目前架构:

n台filebeat客户端来将每台应用上的日志传到kafka,3台kafka做集群用于日志队列,四台ES做集群,前两台存放近两天热数据日志,后两台存放两天前的历史日志,数据保存一个月,目前总数据量44亿,大小为6T。logstash与kibana与ES在一台机器上,kibana域名指向后端三个kibana做轮询。


出现性能问题:

1、集群中只有第一台负载很高,其他节点负载一直都很低,偶尔同为hot数据节点的第二台负载也会稍微有点升高。

2、队列经常堵塞,kafka中uat,pet,prd三个环境的topic同在一个认的logstash消费组。只要其中一个环境的列队积压,其他环境的队列就无法消费了。

3、Kibana登陆后首页打开,需要至少半分钟,日志查询也很慢,至少几分钟才会出结果。

4、有时候ES常因负载高而脱离集群,导致集群节点数据重新分配,集群状态颜色为RED,同时kibana页面打开时显示Red报错。kibana页面间断无法打开的情况约持续一两周。



目前ELK中发现有些索引查询有点慢,于是打开ES索引查询日志来记录慢查询,进而对慢查询日志进行分析,定位问题。慢日志内容如下:

[2017-08-28T11:21:02,377][WARN ][index.search.slowlog.query] [node-3] [logstash-Nginx-2017.08.01][4] took[15s], took_millis[15029], types[], stats[], search
_type[QUERY_THEN_FETCH], total_shards[140], source[{"size":0,"query":{"bool":{"filter":[{"match_none":{"boost":1.0}},{"query_string":{"query":"NOT status:200  OR  NOT
status:304","fields":[],"use_dis_max":true,"tie_breaker":0.0,"default_operator":"or","auto_generate_phrase_queries":false,"max_determined_states":10000,"enable_position
_increment":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"analyze_wildcard":true,"escape":false,"split_on_whitespace":true,
"boost":1.0}}],"disable_coord":false,"adjust_pure_negative":true,"boost":1.0}},"aggregations":{"3":{"terms":{"field":"status","size":5,"min_doc_count":0,"shard_min_doc_
count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_term":"asc"}]},"aggregations":{"2":{"date_histogram":{"field":"@timestamp","format":"epoch_mill
is","interval":"20m","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0,"extended_bounds":{"min":"1503886846372","max":"1503890446372"}}}}}}}],
[2017-08-28T11:21:02,377][WARN ][index.search.slowlog.query] [node-3] [logstash-Nginx-2017.08.01][2] took[15.7s], took_millis[15787], types[], stats[], sear
ch_type[QUERY_THEN_FETCH], total_shards[140], source[{"size":0,"query":{"bool":{"filter":[{"match_none":{"boost":1.0}},{"query_string":{"query":"NOT status:200  OR  NOT
  status:304","fields":[],"use_dis_max":true,"tie_breaker":0.0,"default_operator":"or","auto_generate_phrase_queries":false,"max_determined_states":10000,"enable_positi
on_increment":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"analyze_wildcard":true,"escape":false,"split_on_whitespace":tru
e,"boost":1.0}}],"disable_coord":false,"adjust_pure_negative":true,"boost":1.0}},"aggregations":{"3":{"terms":{"field":"status","size":5,"min_doc_count":0,"shard_min_do
c_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_term":"asc"}]},"aggregations":{"2":{"date_histogram":{"field":"@timestamp","format":"epoch_mi
llis","interval":"20m","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0,"extended_bounds":{"min":"1503886846372","max":"1503890446372"}}}}}}}],

下面进行分析:

待续

相关文章

这篇文章主要介绍“hive和mysql的区别是什么”,在日常操作中...
这篇“MySQL数据库如何改名”文章的知识点大部分人都不太理解...
这篇文章主要介绍“mysql版本查询命令是什么”的相关知识,小...
本篇内容介绍了“mysql怎么修改字段的内容”的有关知识,在实...
这篇文章主要讲解了“mysql怎么删除unique约束”,文中的讲解...
今天小编给大家分享一下mysql怎么查询不为空的字段的相关知识...