Hi,
I’m very new to k3s and Kubernetes, so I’m fairly sure I’m missing some basic understanding. Any help clearing things up is appreciated.
Problem statement:
For personal education I set up a three-node k3s cluster and am trying to deploy rook-ceph on it.
The how-tos I followed suggest this is very straightforward, but I get stuck very early: the first mon never reaches quorum. Log output:
2021-01-08 09:25:45.411225 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2021-01-08 09:25:45.411415 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2021-01-08 09:25:46.011159 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2021-01-08 09:25:46.011347 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2021-01-08 09:25:46.028089 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2021-01-08 09:25:46.222896 I | op-k8sutil: deployment "rook-ceph-mon-a" did not change, nothing to update
2021-01-08 09:25:46.222968 I | op-mon: waiting for mon quorum with [a]
2021-01-08 09:25:46.414837 I | op-mon: mons running: [a]
2021-01-08 09:25:51.529666 I | op-mon: mons running: [a]
[..]
2021-01-08 09:28:09.617539 I | op-mon: mons running: [a]
2021-01-08 09:28:14.735945 I | op-mon: mons running: [a]
2021-01-08 09:28:14.854753 E | ceph-cluster-controller: failed to reconcile. failed to reconcile cluster "my-cluster": failed to configure local ceph cluster: failed to create cluster: failed to start ceph monitors: failed to start mon pods: failed to check mon quorum a: failed to wait for mon quorum: exceeded max retry count waiting for monitors to reach quorum
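For reference, a minimal sketch of how the mon side could be inspected further. It assumes Rook runs in the quickstart's default namespace `rook-ceph` and that mon pods carry the label `app=rook-ceph-mon`; `ROOK_NS` and `inspect_mon` are names made up for this example.

```shell
#!/bin/sh
# Assumption: Rook was deployed into the quickstart's default namespace.
ROOK_NS="${ROOK_NS:-rook-ceph}"

# inspect_mon is a hypothetical helper, not part of Rook or k3s.
inspect_mon() {
    mon_id="${1:-a}"
    # Which node is the mon pod scheduled on, and which pod IP did it get?
    kubectl -n "$ROOK_NS" get pods -l app=rook-ceph-mon -o wide
    # Service in front of this mon.
    kubectl -n "$ROOK_NS" get svc "rook-ceph-mon-$mon_id"
    # Last log lines of the mon deployment named in the operator log.
    kubectl -n "$ROOK_NS" logs "deploy/rook-ceph-mon-$mon_id" --tail=50
}

# Usage: inspect_mon a
```

If the mon pod is Running but the operator still cannot reach it, that would point towards pod-to-pod networking rather than Ceph itself.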
These are the how-tos / information I followed. None of them mentions special network requirements.
https://rook.io/docs/rook/v1.5/ceph-quickstart.html
The overlay network test described here fails in my environment:
rancher(dot)com/docs/rancher/v2.x/en/troubleshooting/networking/
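The idea of that test, as a rough sketch: every test pod pings every other test pod by pod IP. This is not the script from the linked page, only an approximation; it assumes the test pods carry the label `name=overlaytest` and contain a `ping` binary, and `overlay_ping_test` is a made-up name.

```shell
#!/bin/sh
# Sketch of a pod-to-pod ping test across the overlay network.
# Assumption: test pods are labelled name=overlaytest and ship ping.

overlay_ping_test() {
    # One "pod-name pod-IP node-name" triple per line.
    pods=$(kubectl get pods -l name=overlaytest \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.podIP}{" "}{.spec.nodeName}{"\n"}{end}')

    # Outer loop: source pod. Inner loop: every other pod as ping target.
    printf '%s\n' "$pods" | while read -r src_pod src_ip src_node; do
        [ -n "$src_pod" ] || continue
        printf '%s\n' "$pods" | while read -r dst_pod dst_ip dst_node; do
            [ -n "$dst_pod" ] || continue
            [ "$src_pod" = "$dst_pod" ] && continue
            # </dev/null keeps kubectl from consuming the loop's input.
            if kubectl exec "$src_pod" -- ping -c 2 "$dst_ip" </dev/null >/dev/null 2>&1; then
                echo "OK   $src_node -> $dst_node ($dst_ip)"
            else
                echo "FAIL $src_node -> $dst_node ($dst_ip)"
            fi
        done
    done
}

# Usage: overlay_ping_test
```

A FAIL line between two different nodes means a pod on one node cannot reach a pod IP on the other node.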
Setup:
- 3x x86 nodes
- multicore
- 8 GB memory each
- Fedora 33
- SSD
Installation:
# master
curl -sfL https://get.k3s.io | sh -
# agents
SECRET_TOKEN=$(ssh k3s1 'sudo cat /var/lib/rancher/k3s/server/node-token')
curl -sfL https://get.k3s.io | K3S_URL=https://k3s1:6443 K3S_TOKEN="$SECRET_TOKEN" sh -
# kubectl get nodes
NAME   STATUS   ROLES                       AGE    VERSION
k3s1   Ready    control-plane,master,node   3d2h   v1.20.0+k3s2
k3s2   Ready    node                        3d2h   v1.20.0+k3s2
k3s3   Ready    node                        3d2h   v1.20.0+k3s2
# kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   metrics-server-86cbb8457f-8ldjs           1/1     Running     0          3d2h
kube-system   helm-install-traefik-z6gj5                0/1     Completed   0          3d2h
kube-system   svclb-traefik-h298x                       2/2     Running     0          3d2h
kube-system   svclb-traefik-2brrp                       2/2     Running     0          3d2h
kube-system   svclb-traefik-rmfh2                       2/2     Running     0          3d2h
default       overlaytest-zjtbn                         1/1     Running     0          43m
default       overlaytest-n8ds8                         1/1     Running     0          43m
default       overlaytest-qpjk8                         1/1     Running     0          43m
kube-system   coredns-854c77959c-sjp94                  1/1     Running     0          3d2h
kube-system   local-path-provisioner-7c458769fb-vqsr6   1/1     Running     3          3d2h
kube-system   traefik-6f9cbd9bd4-fw4ws                  1/1     Running     0          3d2h
Questions:
- Is Flannel still the default overlay network implementation in k3s?
- Did I miss steps to set up a custom CNI?
- The cluster.yaml of rook-ceph has a network section, but the options ‘network:provider:host|multus’ are unclear to me, so I left them at their defaults.
- Is there a command / way to get details on the current network setup of my k3s cluster? So far I have only found the pod CIDRs:
# kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'
10.42.0.0/24 10.42.1.0/24 10.42.2.0/24
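A slightly fuller variant of that query, as a sketch. It assumes the cluster runs the flannel bundled with k3s, which to my understanding records its settings as `flannel.alpha.coreos.com/*` annotations on each node; `node_network_summary` is a made-up name.

```shell
#!/bin/sh
# Sketch: one line per node with internal IP, pod CIDR and flannel backend.
# Assumption: flannel stores its backend type in the node annotation
# flannel.alpha.coreos.com/backend-type.

node_network_summary() {
    printf 'NODE\tINTERNAL-IP\tPOD-CIDR\tFLANNEL-BACKEND\n'
    kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\t"}{.spec.podCIDR}{"\t"}{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-type}{"\n"}{end}'
}

# Usage: node_network_summary
```

The last column would stay empty if the annotation is missing, for example when a different CNI is in use.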