Prometheus Operator - OOM killed when enabling Istio monitoring

Problem description

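How can I prevent Prometheus from being killed for running out of memory when Istio metrics monitoring is enabled? I use Prometheus Operator, and metrics monitoring worked fine until I created the ServiceMonitors for Istio taken from this article by Prune on Medium. From the article, they are as follows: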

ServiceMonitor for the data plane:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-oper-istio-dataplane
  labels:
    monitoring: istio-dataplane
    release: prometheus
spec:
  selector:
    matchExpressions:
      - {key: istio-prometheus-ignore,operator: DoesNotExist}
  namespaceSelector:
    any: true
  jobLabel: envoy-stats
  endpoints:
  - path: /stats/prometheus
    targetPort: http-envoy-prom
    interval: 15s
    relabelings:
    - sourceLabels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: '.*-envoy-prom'
    - action: labelmap
      regex: "__meta_kubernetes_pod_label_(.+)"
    - sourceLabels: [__meta_kubernetes_namespace]
      action: replace
      targetLabel: namespace
    - sourceLabels: [__meta_kubernetes_pod_name]
      action: replace
      targetLabel: pod_name

ServiceMonitor for the control plane:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-oper-istio-controlplane
  labels:
    release: prometheus
spec:
  jobLabel: istio
  selector:
    matchExpressions:
      - {key: istio,operator: In,values: [mixer,pilot,galley,citadel,sidecar-injector]}
  namespaceSelector:
    any: true
  endpoints:
  - port: http-monitoring
    interval: 15s
  - port: http-policy-monitoring
    interval: 15s

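After I create the ServiceMonitor for the Istio data plane, memory usage grows from about 10GB to 30GB within a minute and the Prometheus replicas are killed by Kubernetes. CPU usage is normal. How can I prevent such a huge increase in resource usage? Is there something wrong with the relabeling? Prometheus is supposed to scrape metrics from about 500 endpoints.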

[EDIT]

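From my investigation, the relabeling seems to have a large impact on resource usage. For example, if I change targetLabel to pod instead of pod_name, resource usage increases immediately.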

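In any case, I have not found a solution to this problem. I switched to the semi-official ServiceMonitor and PodMonitor provided by Istio on GitHub, but that only made Prometheus run longer before the out-of-memory kill. It now takes about an hour for memory usage to go from about 10GB to 32GB.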

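What I see is that after enabling the Istio metrics, the number of time series grows very fast and never stops, which looks like a memory leak to me. Before enabling Istio monitoring, this number was quite stable.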

Do you have any other suggestions?
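
Since the symptom described above is an ever-growing number of time series, one thing that could be tried is limiting what the data plane ServiceMonitor ingests. The following is only a sketch and not a confirmed fix: the sampleLimit value and the drop regex are illustrative assumptions, not values taken from the article or from Istio, and would need to be tuned against the metrics actually used in dashboards and alerts.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-oper-istio-dataplane
  labels:
    monitoring: istio-dataplane
    release: prometheus
spec:
  selector:
    matchExpressions:
      - {key: istio-prometheus-ignore, operator: DoesNotExist}
  namespaceSelector:
    any: true
  jobLabel: envoy-stats
  # Assumed value: maximum number of samples accepted per scrape.
  # A scrape that exceeds the limit is treated as failed.
  sampleLimit: 20000
  endpoints:
  - path: /stats/prometheus
    targetPort: http-envoy-prom
    interval: 15s
    # The relabelings from the question would stay here unchanged.
    metricRelabelings:
    # Assumed pattern: drop Envoy-internal series by metric name before
    # they are stored, so they never become time series in the TSDB.
    - sourceLabels: [__name__]
      action: drop
      regex: 'envoy_(cluster|listener|http)_.*'
```

Unlike relabelings, which act on targets before the scrape, metricRelabelings are applied to the scraped samples before ingestion, so dropped series should not count toward memory used by the head block. Whether this is enough for roughly 500 endpoints is something only a test in the affected cluster can show.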

istio istio-kiali istio-prometheus prometheus prometheus-operator