Hi @superseb,
instead of pinging some external site, I changed my test to verify functionality: `curl -k https://kubernetes`. It sometimes succeeds and sometimes fails with `curl: (6) Could not resolve host: kubernetes`, depending on which CoreDNS pod is queried: if it is the local one, name resolution succeeds; if it is the one running on the other node, it fails.
Please find below some more CoreDNS logs and a tcpdump trace of the communication.
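For reference, a repro loop along these lines (a sketch; it assumes it runs inside a pod on the cluster with `curl` installed) makes the intermittent failures easy to count:

```shell
# Hypothetical repro loop: hit the kubernetes service by name a few times
# and report which attempts fail (most failures here are DNS resolution).
for i in 1 2 3; do
  if curl -sk --max-time 3 https://kubernetes/ >/dev/null 2>&1; then
    echo "attempt $i: ok"
  else
    echo "attempt $i: failed (likely DNS)"
  fi
done
```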
BACKEND (core01)
10:27:29.837759 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 10649+ A? kubernetes.default.svc.cluster.local. (54)
10:27:29.837772 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 59807+ AAAA? kubernetes.default.svc.cluster.local. (54)
10:27:29.838427 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 10649*- 1/0/0 A 10.43.0.1 (106)
10:27:29.838445 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 59807*- 0/1/0 (147)
10:27:34.840202 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 10649+ A? kubernetes.default.svc.cluster.local. (54)
10:27:34.840243 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 59807+ AAAA? kubernetes.default.svc.cluster.local. (54)
10:27:34.840894 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 10649*- 1/0/0 A 10.43.0.1 (106)
10:27:34.841078 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 59807*- 0/1/0 (147)
10:27:39.845369 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 23883+ A? kubernetes. (28)
10:27:39.845404 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 25938+ AAAA? kubernetes. (28)
10:27:39.861660 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 23883 NXDomain 0/1/0 (103)
10:27:39.865593 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 25938 NXDomain 0/1/0 (103)
10:27:44.850484 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 23883+ A? kubernetes. (28)
10:27:44.850517 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 25938+ AAAA? kubernetes. (28)
10:27:44.851205 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 23883 NXDomain* 0/1/0 (103)
10:27:44.851232 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 25938 NXDomain* 0/1/0 (103)
FRONTEND
10:27:29.839175 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 10649+ A? kubernetes.default.svc.cluster.local. (54)
10:27:29.839221 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 59807+ AAAA? kubernetes.default.svc.cluster.local. (54)
10:27:29.839453 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 10649*- 1/0/0 A 10.43.0.1 (106)
10:27:29.839549 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 59807*- 0/1/0 (147)
10:27:34.841637 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 10649+ A? kubernetes.default.svc.cluster.local. (54)
10:27:34.841680 IP 172.30.0.2.47905 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.40100 > 10.42.1.2.domain: 59807+ AAAA? kubernetes.default.svc.cluster.local. (54)
10:27:34.841963 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 10649*- 1/0/0 A 10.43.0.1 (106)
10:27:34.842147 IP 192.168.225.2.34713 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.40100: 59807*- 0/1/0 (147)
10:27:39.846767 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 23883+ A? kubernetes. (28)
10:27:39.846811 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 25938+ AAAA? kubernetes. (28)
10:27:39.862540 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 23883 NXDomain 0/1/0 (103)
10:27:39.866623 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 25938 NXDomain 0/1/0 (103)
10:27:44.851909 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 23883+ A? kubernetes. (28)
10:27:44.851953 IP 172.30.0.2.34582 > 192.168.225.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.7.44551 > 10.42.1.2.domain: 25938+ AAAA? kubernetes. (28)
10:27:44.852138 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 23883 NXDomain* 0/1/0 (103)
10:27:44.852219 IP 192.168.225.2.24000 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.2.domain > 10.42.0.7.44551: 25938 NXDomain* 0/1/0 (103)
CoreDNS (remote instance)
[INFO] 10.42.0.7:40100 - 10649 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000162801s
[INFO] 10.42.0.7:40100 - 59807 "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000258501s
[INFO] 10.42.0.7:40100 - 10649 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.0002726s
[INFO] 10.42.0.7:40100 - 59807 "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000397001s
[INFO] 10.42.0.7:44551 - 23883 "A IN kubernetes. udp 28 false 512" NXDOMAIN qr,rd,ra 103 0.015610524s
[INFO] 10.42.0.7:44551 - 25938 "AAAA IN kubernetes. udp 28 false 512" NXDOMAIN qr,rd,ra 103 0.019523331s
[INFO] 10.42.0.7:44551 - 23883 "A IN kubernetes. udp 28 false 512" NXDOMAIN qr,aa,rd,ra 103 0.0001025s
[INFO] 10.42.0.7:44551 - 25938 "AAAA IN kubernetes. udp 28 false 512" NXDOMAIN qr,aa,rd,ra 103 0.0001603s
CoreDNS (successfully querying the local instance)
[INFO] 10.42.0.7:60763 - 55668 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000145701s
[INFO] 10.42.0.7:60763 - 2427 "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.0000924s
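As an aside, the relevant fields (rcode, flags, duration) can be pulled out of these log lines mechanically when comparing the two instances; a minimal sketch, assuming the default output format of the CoreDNS `log` plugin as seen above:

```python
import re

# Matches a CoreDNS query log line, e.g.:
# [INFO] 10.42.0.7:44551 - 23883 "A IN kubernetes. udp 28 false 512" NXDOMAIN qr,rd,ra 103 0.015610524s
LOG_RE = re.compile(
    r'\[INFO\] (?P<client>\S+) - \d+ '
    r'"(?P<qtype>\S+) IN (?P<name>\S+) udp \d+ \S+ \d+" '
    r'(?P<rcode>\S+) (?P<flags>\S+) \d+ (?P<duration>\S+)s'
)

def parse(line):
    """Return the parsed fields as a dict, or None if the line doesn't match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

rec = parse('[INFO] 10.42.0.7:44551 - 23883 "A IN kubernetes. udp 28 false 512" '
            'NXDOMAIN qr,rd,ra 103 0.015610524s')
print(rec["rcode"], rec["flags"])  # NXDOMAIN qr,rd,ra
```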
CoreDNS ConfigMap
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . "/etc/resolv.conf"
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n    errors\n    health {\n      lameduck 5s\n    }\n    ready\n    kubernetes cluster.local in-addr.arpa ip6.arpa {\n      pods insecure\n      fallthrough in-addr.arpa ip6.arpa\n    }\n    prometheus :9153\n    forward . \"/etc/resolv.conf\"\n    cache 30\n    loop\n    reload\n    loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2020-08-05T07:38:55Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "7177"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 63cbe102-c521-433f-81b2-3627b4e7a36e
Using kube-dns instead of CoreDNS doesn't make a difference; the issue remains.