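Problem description
I installed a clean K8s cluster on virtual machines (Debian 10). After installing it and integrating it into my environment, I checked connectivity from my Alpine test image. Outgoing traffic did not work, and the CoreDNS logs showed nothing. As a workaround, I overwrote /etc/resolv.conf in my build image and replaced the DNS entries (for example, setting 1.1.1.1 as the nameserver). After this short "hack", the connection to the internet worked fine. However, the workaround is not a long-term solution, and I would like to use the official way. In the K8s CoreDNS documentation I found the forward section, and I understand the directive as an option to forward queries to a predefined local resolver. I think the forwarding to the local resolv.conf and the resolution process are not working properly. Can anyone help me with this problem?
Basic setup:
- K8s version: 1.19.0
- K8s setup: 1 master node + 2 worker nodes
- Based on: Debian 10 VMs
- CNI: Flannel
Status of the CoreDNS pods: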
kube-system coredns-xxxx 1/1 Running 1 26h
kube-system coredns-yyyy 1/1 Running 1 26h
CoreDNS logs:
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
CoreDNS config:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: ""
  name: coredns
  namespace: kube-system
  resourceVersion: "219"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: xxx
Output from the Alpine image:
/ # nslookup -debug google.de
;; connection timed out; no servers could be reached
Output of the pod's resolv.conf:
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search development.svc.cluster.local svc.cluster.local cluster.local invalid
options ndots:5
Output of the host's resolv.conf:
cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 213.136.95.11
nameserver 213.136.95.10
search invalid
Output of /run/flannel/subnet.env on the host:
cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Output of kubectl get pods -n kube-system -o wide:
coredns-54694b8f47-4sm4t 1/1 Running 0 14d 10.244.1.48 xxx3-node-1 <none> <none>
coredns-54694b8f47-6c7zh 1/1 Running 0 14d 10.244.0.43 xxx2-master <none> <none>
coredns-54694b8f47-lcthf 1/1 Running 0 14d 10.244.2.88 xxx4-node-2 <none> <none>
etcd-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-apiserver-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-controller-manager-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-flannel-ds-amd64-4w8zl 1/1 Running 8 28d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-flannel-ds-amd64-w7m44 1/1 Running 7 28d xxx.xx.xx.xxx xxx3-node-1 <none> <none>
kube-flannel-ds-amd64-xztqm 1/1 Running 6 28d xxx.xx.xx.xxx xxx4-node-2 <none> <none>
kube-proxy-dfs85 1/1 Running 4 28d xxx.xx.xx.xxx xxx4-node-2 <none> <none>
kube-proxy-m4hl2 1/1 Running 4 28d xxx.xx.xx.xxx xxx3-node-1 <none> <none>
kube-proxy-s7p4s 1/1 Running 8 28d xxx.xx.xx.xxx xxx2-master <none> <none>
kube-scheduler-xxx2-master 1/1 Running 7 27d xxx.xx.xx.xxx xxx2-master <none> <none>
Answer
Problem:
Both CoreDNS pods were deployed only on the master node. You can check your setup with this command:
kubectl get pods -n kube-system -o wide | grep coredns
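To see the distribution at a glance, the check above can be turned into a per-node count. This is only a sketch: the function names coredns_nodes and pods_per_node are made up for illustration, and it assumes the CoreDNS pods carry the kubeadm default label k8s-app=kube-dns.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of kubectl.

# Print the node name of every CoreDNS pod, one per line.
# Assumption: the pods are labelled k8s-app=kube-dns (kubeadm default).
coredns_nodes() {
    kubectl get pods -n kube-system -l k8s-app=kube-dns \
        -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}'
}

# Read node names on stdin and print "<node>: <pod count>" for each node.
pods_per_node() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

Run it as coredns_nodes | pods_per_node. If only a single node shows up, every DNS query from pods on the other nodes has to cross the overlay network to reach CoreDNS.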
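Solution:
I was able to solve the problem by scaling up the CoreDNS pods and editing the deployment configuration. The following commands need to be executed.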
- kubectl edit deployment coredns -n kube-system
- Set the replicas value to the number of nodes, e.g. 3
- kubectl patch deployment coredns -n kube-system -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"force-update/updated-at\":\"$(date +%s)\"}}}}}"
- kubectl get pods -n kube-system -o wide | grep coredns
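As a possible non-interactive variant of the steps above, kubectl scale and kubectl rollout restart (available in kubectl 1.15 and later) should achieve the same effect as editing the replica count and patching an annotation. The sketch below only prints the commands so they can be reviewed first. The function name coredns_scale_commands is hypothetical, and whether the new pods actually land on different nodes is still up to the scheduler.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) the kubectl commands that scale
# the coredns deployment and restart its pods.
# $1 = desired replica count, e.g. the number of nodes.
coredns_scale_commands() {
    replicas="$1"
    # Reject anything that is empty, non-numeric, or zero.
    case "$replicas" in
        ''|*[!0-9]*|0)
            echo "replicas must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "kubectl scale deployment coredns -n kube-system --replicas=$replicas"
    echo "kubectl rollout restart deployment coredns -n kube-system"
    echo "kubectl rollout status deployment coredns -n kube-system"
}
```

For example, coredns_scale_commands 3 prints three commands. Once they look right, pipe them to a shell with coredns_scale_commands 3 | sh.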
Hint
If you still have problems with CoreDNS and DNS resolution only works sporadically, take a look at this post.
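When resolution only works some of the time, one way to narrow it down is to query each CoreDNS pod IP directly from a test pod and see which ones answer. The following is a minimal sketch, not an official procedure. The function name check_resolvers is hypothetical, and it relies on the lookup command's exit status, which can differ between nslookup implementations.

```shell
#!/bin/sh
# Hypothetical helper: try to resolve one name against each DNS server IP
# read from stdin (one IP per line) and report which servers answered.
# $1 = name to resolve, $2 = lookup command (defaults to nslookup).
check_resolvers() {
    name="$1"
    lookup="${2:-nslookup}"
    while read -r ip; do
        [ -n "$ip" ] || continue   # skip blank lines
        # stdin is redirected so the lookup cannot consume the IP list
        if "$lookup" "$name" "$ip" </dev/null >/dev/null 2>&1; then
            echo "$ip ok"
        else
            echo "$ip FAIL"
        fi
    done
}
```

With the pod IPs from the listing in the question, a call from inside a test pod could look like printf '10.244.1.48\n10.244.0.43\n10.244.2.88\n' | check_resolvers google.de. If only the CoreDNS pod on the local node answers, the cross-node pod network is the more likely culprit than the CoreDNS configuration.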