calico-node pods are being killed

Problem description

I recently noticed that many Pods in my GKE cluster are being killed:

k get events -n kube-system | grep "Stopping container calico-node"
14m         normal    Killing             pod/calico-node-26rmv                                     Stopping container calico-node
29m         normal    Killing             pod/calico-node-2bz2c                                     Stopping container calico-node
10m         normal    Killing             pod/calico-node-2mjkt                                     Stopping container calico-node
26m         normal    Killing             pod/calico-node-2srrt                                     Stopping container calico-node
34m         normal    Killing             pod/calico-node-2vwz9                                     Stopping container calico-node
23m         normal    Killing             pod/calico-node-4fqdf                                     Stopping container calico-node
31m         normal    Killing             pod/calico-node-4hj2h                                     Stopping container calico-node
14m         normal    Killing             pod/calico-node-4w9fr                                     Stopping container calico-node
7m          normal    Killing             pod/calico-node-56ns7                                     Stopping container calico-node
10m         normal    Killing             pod/calico-node-5mxjh                                     Stopping container calico-node
32m         normal    Killing             pod/calico-node-65zmr                                     Stopping container calico-node
7m38s       normal    Killing             pod/calico-node-66bnz                                     Stopping container calico-node
19m         normal    Killing             pod/calico-node-66kx4                                     Stopping container calico-node
32m         normal    Killing             pod/calico-node-6bctr                                     Stopping container calico-node
38m         normal    Killing             pod/calico-node-6gq9b                                     Stopping container calico-node
29m         normal    Killing             pod/calico-node-6hjk5                                     Stopping container calico-node
15m         normal    Killing             pod/calico-node-6kn67                                     Stopping container calico-node
27m         normal    Killing             pod/calico-node-6q6cp                                     Stopping container calico-node

Some of these Pods run on node pools that have no autoscaling enabled.

On the logging side, the last log lines I see in the pod are (newest first):

2021-05-10 11:23:03   "plugins": [
2021-05-10 11:23:03   "cniVersion": "0.3.1",
2021-05-10 11:23:03   "name": "k8s-pod-network",
2021-05-10 11:23:03 CNI config: {
2021-05-10 11:23:03 Using CNI config template from CNI_NETWORK_CONFIG environment variable.
2021-05-10 11:23:03 /host/secondary-bin-dir is non-writeable, skipping
2021-05-10 11:23:03 CNI plugin version: v3.8.8-1-gke.2
2021-05-10 11:23:03 Wrote Calico CNI binaries to /host/opt/cni/bin
2021-05-10 11:23:03 ls: cannot access '/calico-secrets': No such file or directory
2021-05-10 11:22:53 No Calico CNI spec template is specified. Exiting (0)...
2021-05-10 11:22:53 Calico Network Policy is enabled
2021-05-10 11:22:53 Calico network policy config:  true

How can I investigate the possible cause further? Since this is GKE, I don't have access to /var/log/kube-scheduler.log.

In GCP Logging, all I see is the following (screenshot not preserved).

I have already checked: there are no OOM errors.
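
For anyone who wants to narrow this down the same way, here is a minimal sketch of two shell helpers. It assumes kubectl access to the kube-system namespace and the event column layout shown above (AGE, TYPE, REASON, OBJECT, MESSAGE). The function names count_kills_per_pod and inspect_calico_pod are hypothetical, made up for illustration. This is not a confirmed fix, only a way to collect more evidence.

```shell
#!/usr/bin/env bash
# Sketch only: helpers for gathering more detail about stopped calico-node containers.

# Count "Killing" events per pod from `kubectl get events` text on stdin.
# Assumes the columns are: AGE TYPE REASON OBJECT MESSAGE...
count_kills_per_pod() {
  awk '$3 == "Killing" && $4 ~ /^pod\// { n[$4]++ }
       END { for (p in n) print n[p], p }' | sort -k1,1nr -k2,2
}

# Show what the API server still knows about one pod (hypothetical helper).
# $1 = pod name, for example calico-node-26rmv
inspect_calico_pod() {
  pod="$1"
  # Status, restart counts, last termination state, and recent events for the pod.
  kubectl describe pod "$pod" -n kube-system
  # Logs of the previous calico-node container instance, if it was restarted in place.
  kubectl logs "$pod" -n kube-system -c calico-node --previous
}
```

Usage: `kubectl get events -n kube-system | count_kills_per_pod` prints one line per pod with its number of kill events, highest first. A count above 1 for the same pod name suggests the container is restarted in place; a count of 1 across many pod names is more consistent with the pods themselves being replaced. `inspect_calico_pod calico-node-26rmv` only works while that pod object still exists.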
