kubernetes: failed to load existing certificate apiserver-etcd-client

Problem description

My cluster certificates have expired, and I can no longer run any kubectl commands.

root@node1:~# kubectl get ns
Unable to connect to the server: x509: certificate has expired or is not yet valid
root@node1:~# 

I created this cluster with Kubespray; the kubeadm version is v1.16.3 and kubernetesVersion is v1.16.3.

root@node1:~# kubeadm alpha certs check-expiration
failed to load existing certificate apiserver-etcd-client: open /etc/kubernetes/pki/apiserver-etcd-client.crt: no such file or directory
To see the stack trace of this error execute with --v=5 or higher
root@node1:~# 
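
Note: since check-expiration itself errors out, the expiry dates can also be read directly with openssl, independent of kubeadm. This is a minimal check, assuming the certificates live under /etc/kubernetes/pki as in the listing below:

# Print the notAfter date of every certificate in the kubeadm PKI directory
for crt in /etc/kubernetes/pki/*.crt; do
    echo -n "$crt: "
    openssl x509 -noout -enddate -in "$crt"
done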

I also found that the apiserver-etcd-client.crt and apiserver-etcd-client.key files are missing from the /etc/kubernetes/pki directory.

root@node1:/etc/kubernetes/pki# ls -ltr
total 72
-rw------- 1 root root 1679 Jan 24 2020 ca.key
-rw-r--r-- 1 root root 1025 Jan 24 2020 ca.crt
-rw-r----- 1 root root 1679 Jan 24 2020 apiserver.key.old
-rw-r----- 1 root root 1513 Jan 24 2020 apiserver.crt.old
-rw------- 1 root root 1679 Jan 24 2020 apiserver.key
-rw-r--r-- 1 root root 1513 Jan 24 2020 apiserver.crt
-rw------- 1 root root 1675 Jan 24 2020 apiserver-kubelet-client.key
-rw-r--r-- 1 root root 1099 Jan 24 2020 apiserver-kubelet-client.crt
-rw-r----- 1 root root 1675 Jan 24 2020 apiserver-kubelet-client.key.old
-rw-r----- 1 root root 1099 Jan 24 2020 apiserver-kubelet-client.crt.old
-rw------- 1 root root 1679 Jan 24 2020 front-proxy-ca.key
-rw-r--r-- 1 root root 1038 Jan 24 2020 front-proxy-ca.crt
-rw-r----- 1 root root 1675 Jan 24 2020 front-proxy-client.key.old
-rw-r----- 1 root root 1058 Jan 24 2020 front-proxy-client.crt.old
-rw------- 1 root root 1675 Jan 24 2020 front-proxy-client.key
-rw-r--r-- 1 root root 1058 Jan 24 2020 front-proxy-client.crt
-rw------- 1 root root 451 Jan 24 2020 sa.pub
-rw------- 1 root root 1679 Jan 24 2020 sa.key
root@node1:/etc/kubernetes/pki#
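
Since the ClusterConfiguration further down uses an external etcd with its own certFile/keyFile under /etc/ssl/etcd/ssl, kubeadm does not manage an apiserver-etcd-client certificate in such a setup, which would explain why those files are absent. One way to confirm which etcd client certificate the API server actually uses (assuming the usual kubeadm static-pod manifest location):

# Show the etcd-related flags in the kube-apiserver static pod manifest
grep -- '--etcd-' /etc/kubernetes/manifests/kube-apiserver.yaml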

I tried the following commands, but none of them had any effect and they all returned errors:

#sudo kubeadm alpha certs renew all
#kubeadm alpha phase certs apiserver-etcd-client
#kubeadm alpha certs apiserver-etcd-client --config /etc/kubernetes/kubeadm-config.yaml

Kubespray command:

#ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml

The above command ended with the following error:

FAILED! => {"attempts": 5, "changed": true, "cmd": ["/usr/local/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"], "delta": "0:01:15.058756", "end": "2021-02-05 13:32:51.656901", "msg": "non-zero return code", "rc": 1, "start": "2021-02-05 13:31:36.598145", "stderr": "timed out waiting for the condition\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["timed out waiting for the condition", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "", "stdout_lines": []}
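
The "timed out waiting for the condition" message indicates that kubeadm cannot get a working connection to the API server. Reachability can be checked independently of certificate validity, for example (master1_IP is a placeholder, as in the config below):

# -k skips certificate validation, so this tests reachability rather than cert validity
curl -k https://master1_IP:6443/healthz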

# cat /etc/kubernetes/kubeadm-config.yaml 
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: master1_IP
  bindPort: 6443
certificateKey: xxx
nodeRegistration:
  name: node1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  criSocket: /var/run/dockershim.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: cluster.local
etcd:
  external:
      endpoints:
      - https://master1:2379
      - https://master2:2379
      - https://master3:2379
      caFile: /etc/ssl/etcd/ssl/ca.pem
      certFile: /etc/ssl/etcd/ssl/node-node1.pem
      keyFile: /etc/ssl/etcd/ssl/node-node1-key.pem
dns:
  type: CoreDNS
  imageRepository: docker.io/coredns
  imageTag: 1.6.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: IP/18
  podSubnet: IP/18
kubernetesVersion: v1.16.3
controlPlaneEndpoint: master1_IP:6443
certificatesDir: /etc/kubernetes/ssl
imageRepository: gcr.io/google-containers
apiServer:

Solution

First you need to renew the expired certificates. Use kubeadm to do this:

kubeadm alpha certs renew apiserver
kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew front-proxy-client
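
Alternatively, because this cluster uses an external etcd (so there is no apiserver-etcd-client certificate for kubeadm to manage), you can try pointing kubeadm at the existing cluster configuration and renewing everything in one go. This is a hedged sketch, assuming your kubeadm version accepts --config here and that certificatesDir in that file points at the directory where the certificates actually live:

# Renew all certificates kubeadm manages, using the cluster's own configuration
kubeadm alpha certs renew all --config /etc/kubernetes/kubeadm-config.yaml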

Next, generate new kubeconfig files:

kubeadm alpha kubeconfig user --client-name kubernetes-admin --org system:masters > /etc/kubernetes/admin.conf
kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
# instead of $(hostname) you may need to pass the name of the master node, as found in the existing /etc/kubernetes/kubelet.conf file.
kubeadm alpha kubeconfig user --client-name system:node:$(hostname) --org system:nodes > /etc/kubernetes/kubelet.conf 
kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
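
One way to check that a regenerated kubeconfig carries a fresh client certificate is to decode the embedded certificate and look at its expiry:

# Extract the base64-encoded client certificate from admin.conf and print its expiry
grep 'client-certificate-data' /etc/kubernetes/admin.conf \
  | awk '{print $2}' | base64 -d | openssl x509 -noout -enddate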

Copy the new kubernetes-admin kubeconfig file:

cp /etc/kubernetes/admin.conf ~/.kube/config
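
kubectl should now be able to reach the API server again; the command that failed at the very beginning is a quick way to confirm:

kubectl get ns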

Finally, you need to restart kube-apiserver, kube-controller-manager and kube-scheduler. You can use the following commands or simply reboot the master node:

sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)
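
These components run as static pods, so if sending SIGHUP does not make them pick up the renewed certificates, restarting their containers is another option. A sketch assuming Docker is the container runtime, which matches the criSocket /var/run/dockershim.sock in the config above:

# Restart the control-plane containers so they reload the renewed certificates
docker ps --format '{{.ID}} {{.Names}}' \
  | grep -E 'k8s_kube-(apiserver|controller-manager|scheduler)' \
  | awk '{print $1}' | xargs -r docker restart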

In addition, you can find more information on GitHub, and this answer may also be of great help to you.
