Kubernetes: Unable to reach pods over flannel

Problem description

I am new to Kubernetes. I have set up 3 Ubuntu 20.04.2 LTS VMs on Oracle VirtualBox Manager.

I installed docker, kubelet, kubeadm, and kubectl on all 3 VMs, following this documentation:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

I created the cluster by following this guide: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

I set up flannel with the following commands:

$ wget https://github.com/coreos/flannel/raw/master/Documentation/kube-flannel.yml
$ kubectl create -f kube-flannel.yml

Everything looked fine.

root@master-node:~/k8s# kubectl get nodes -o wide
NAME          STATUS   ROLES                  AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
master-node   Ready    control-plane,master   23h   v1.20.5   192.168.108.10   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   docker://19.3.15
node-1        Ready    <none>                 10h   v1.20.5   192.168.108.11   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   docker://19.3.15
node-2        Ready    <none>                 10h   v1.20.5   192.168.108.12   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   docker://19.3.15

Then I created an nginx deployment with 3 replicas.

root@master-node:~/k8s# kubectl get po -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP           NODE     NOMINATED NODE   READINESS GATES
dnsutils                        1/1     Running   2          127m   10.244.2.8   node-2   <none>           <none>
nginx-deploy-7848d4b86f-4nvg7   1/1     Running   0          9m8s   10.244.2.9   node-2   <none>           <none>
nginx-deploy-7848d4b86f-prj7g   1/1     Running   0          9m8s   10.244.1.9   node-1   <none>           <none>
nginx-deploy-7848d4b86f-r95hq   1/1     Running   0          9m8s   10.244.1.8   node-1   <none>           <none>

The problem appears only when I try to curl an nginx pod from the master node. The request hangs with no response.

root@master-node:~/k8s# curl 10.244.2.9
^C

I then logged in to the pod and confirmed that nginx is up.

root@master-node:~/k8s# kubectl exec -it nginx-deploy-7848d4b86f-4nvg7  -- /bin/bash
root@nginx-deploy-7848d4b86f-4nvg7:/# curl 127.0.0.1
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@nginx-deploy-7848d4b86f-4nvg7:/# exit
exit

Here is the output of kubectl describe pod for one of the pods:

root@master-node:~/k8s# kubectl describe pod nginx-deploy-7848d4b86f-4nvg7
Name:         nginx-deploy-7848d4b86f-4nvg7
Namespace:    default
Priority:     0
Node:         node-2/192.168.108.12
Start Time:   Sun, 28 Mar 2021 04:49:15 +0000
Labels:       app=nginx
              pod-template-hash=7848d4b86f
Annotations:  <none>
Status:       Running
IP:           10.244.2.9
IPs:
  IP:           10.244.2.9
Controlled By:  ReplicaSet/nginx-deploy-7848d4b86f
Containers:
  nginx:
    Container ID:   docker://f6322e65cb98e54cc220a786ffb7c967bbc07d80fe8d118a19891678109680d8
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:b0ea179ab61c789ce759dbe491cc534e293428ad232d00df83ce44bf86261179
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 28 Mar 2021 04:49:19 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xhkzx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-xhkzx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xhkzx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  25m   default-scheduler  Successfully assigned default/nginx-deploy-7848d4b86f-4nvg7 to node-2
  Normal  Pulling    25m   kubelet            Pulling image "nginx"
  Normal  Pulled     25m   kubelet            Successfully pulled image "nginx" in 1.888247052s
  Normal  Created    25m   kubelet            Created container nginx
  Normal  Started    25m   kubelet            Started container nginx

I tried to troubleshoot by following this guide: Debugging Kubernetes Networking

root@master-node:~/k8s# ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:db:6f:21 brd ff:ff:ff:ff:ff:ff
3: enp0s8: <BROADCAST,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:90:88:7c brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:1d:21:66:20 brd ff:ff:ff:ff:ff:ff
5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default
    link/ether 4a:df:fb:be:7b:0e brd ff:ff:ff:ff:ff:ff
6: flannel.1: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 02:48:db:46:53:60 brd ff:ff:ff:ff:ff:ff
7: cni0: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether fa:29:13:98:2c:31 brd ff:ff:ff:ff:ff:ff
8: vethc2e0fa86@if3: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
    link/ether 7a:66:b0:97:db:81 brd ff:ff:ff:ff:ff:ff link-netnsid 0
9: veth3eb514e1@if3: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
    link/ether 3e:3c:9d:20:5c:42 brd ff:ff:ff:ff:ff:ff link-netnsid 1
11: veth0@if10: <BROADCAST,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:35:f0:fb:e3:b1 brd ff:ff:ff:ff:ff:ff link-netns test1
root@master-node:~/k8s# kubectl create -f nwtool-deployment.yaml
deployment.apps/nwtool-deploy created
root@master-node:~/k8s# kubectl get po
NAME                             READY   STATUS    RESTARTS   AGE
nwtool-deploy-6d8c99644b-fq6gv   1/1     Running   0          14s
nwtool-deploy-6d8c99644b-fwc6d   1/1     Running   0          14s
root@master-node:~/k8s# ^C
root@master-node:~/k8s# kubectl exec -it nwtool-deploy-6d8c99644b-fq6gv -- ip link list
1: lo: <LOOPBACK,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth0@if13: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
    link/ether 2e:02:b6:97:2f:10 brd ff:ff:ff:ff:ff:ff
root@master-node:~/k8s# kubectl exec -it nwtool-deploy-6d8c99644b-fwc6d -- ip link list
1: lo: <LOOPBACK,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth0@if14: <BROADCAST,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
    link/ether 82:21:fa:aa:34:27 brd ff:ff:ff:ff:ff:ff
root@master-node:~/k8s# ip link list
1: lo: <LOOPBACK,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:35:f0:fb:e3:b1 brd ff:ff:ff:ff:ff:ff link-netns test1
root@master-node:~/k8s#

It looks like no veth pair is created on the master node for the new pods. Any idea how to fix this? Any help would be appreciated. Thanks!

Solution

I found the problem. Credit goes to the article "Kubernetes with Flannel — Understanding the Networking — Part 1 (Setup the demo)". I am copying the excerpt that helped me solve it:

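The VMs are created with 2 interfaces, and when running flannel you need to specify the interface name explicitly. Otherwise, you may see the pods come up and get IP addresses but fail to communicate with each other.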
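One way to find the interface name to pass to flannel is to look up which interface holds the node's INTERNAL-IP as reported by kubectl get nodes -o wide. The following is a minimal sketch, not a definitive procedure: the helper name iface_for_ip is hypothetical, and the sample lines (including the 10.0.2.15 address on enp0s3) are assumed for illustration rather than taken from the output in the question.

```shell
#!/bin/sh
# Print the name of the interface that carries a given IPv4 address.
# Reads lines in the format of `ip -o -4 addr show` on stdin.
iface_for_ip() {
  awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
}

# Assumed sample input; on a real node use: ip -o -4 addr show
sample='2: enp0s3    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
3: enp0s8    inet 192.168.108.10/24 brd 192.168.108.255 scope global enp0s8'

# Look up the interface for the master node's INTERNAL-IP.
printf '%s\n' "$sample" | iface_for_ip 192.168.108.10
```

On a node, the equivalent call would be ip -o -4 addr show | iface_for_ip 192.168.108.10, using that node's own INTERNAL-IP.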

The interface name enp0s8 needs to be specified in the flannel manifest file.

vagrant@master:~$ grep -A8 containers kube-flannel.yml
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.10.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=enp0s8          ####Add the iface name here.
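The same edit can be scripted instead of made by hand. This is a minimal sketch, assuming GNU sed and that the manifest contains the line "- --kube-subnet-mgr" as in the excerpt. The helper add_iface and the stand-in file flannel-args.yml are hypothetical names, used so that the effect is visible without a cluster.

```shell
#!/bin/sh
# Stand-in for the args section of kube-flannel.yml (assumed layout).
cat > flannel-args.yml <<'EOF'
        args:
        - --ip-masq
        - --kube-subnet-mgr
EOF

# Append an --iface argument after --kube-subnet-mgr, reusing its indentation.
# Does nothing if an --iface argument is already present.
add_iface() {
  file="$1"; iface="$2"
  grep -q -- "--iface=" "$file" && return 0
  sed -i "s/^\( *\)- --kube-subnet-mgr\$/&\n\1- --iface=$iface/" "$file"
}

add_iface flannel-args.yml enp0s8
cat flannel-args.yml
```

On the real manifest this would be add_iface kube-flannel.yml enp0s8, followed by re-applying the manifest (for example with kubectl apply -f kube-flannel.yml) so that the flannel pods restart with the new argument.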

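If the interface name differs between nodes, you can match it with a regular expression instead. For example, if the worker nodes may use either enp0s8 or enp0s9, the flannel argument would be --iface-regex=enp0s8|enp0s9. Write it as a plain alternation: wrapped in square brackets it would be a character class that matches single characters rather than whole names.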
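The shape of the expression matters here. The sketch below uses grep -E as a stand-in for flannel's matcher, assuming flannel applies the expression as an unanchored search over each interface name. The helper match_ifaces is hypothetical, and the names are taken from the ip link list output in the question, plus enp0s9.

```shell
#!/bin/sh
# Print the interface names (one per line on stdin) that a regex matches.
match_ifaces() {
  grep -E -- "$1" || true
}

names='lo
enp0s3
enp0s8
enp0s9
docker0
cni0'

echo "alternation:"
printf '%s\n' "$names" | match_ifaces 'enp0s8|enp0s9'

echo "character class:"
printf '%s\n' "$names" | match_ifaces '[enp0s8|enp0s9]'
```

The alternation selects only enp0s8 and enp0s9. The bracketed form is a character class, so it also matches enp0s3, docker0, and cni0, which would defeat the purpose of pinning the interface.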