[Rancher 2.6.3] Configured cacerts checksum does not match given --ca-checksum when using metallb and cert manager

Rancher Server Setup

  • Rancher version: 2.6.3
  • Installation option (Docker install/Helm Chart): Helm Chart
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc):
  • Proxy/Cert Details:

Information about the Cluster

  • Kubernetes version: v1.21.9+rke2r1
  • Cluster Type (Local/Downstream):
    • Downstream installed using RKE2

User Information

  • Ubuntu version: 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64)

Describe the bug

After running the registration command, I got the error log below.

rancher@test-node:~$ curl --insecure -fL https://192.168.105.200/system-agent-install.sh | sudo  sh -s - --server https://192.168.105.200 --label 'cattle.io/os=linux' --token kfkwfjxfnt8rdnm8zpl8kbr5rv269x7brcwgnmnpgd594kkbhxdvv4 --ca-checksum 543edb437be8e3b68c60bb09fc27bde24f26ce62bec2e44e182681c2df6ed06b --etcd --controlplane --worker
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 27619    0 27619    0     0   586k      0 --:--:-- --:--:-- --:--:--  599k
[INFO]  Label: cattle.io/os=linux
[INFO]  Role requested: etcd
[INFO]  Role requested: controlplane
[INFO]  Role requested: worker
[INFO]  Using default agent configuration directory /etc/rancher/agent
[INFO]  Using default agent var directory /var/lib/rancher/agent
[INFO]  Determined CA is necessary to connect to Rancher
[INFO]  Successfully downloaded CA certificate
[INFO]  Value from https://192.168.105.200/cacerts is an x509 certificate
[ERROR]  Configured cacerts checksum (1382944946dbe8c6faf7d0bd6d33d6593f3416579e75efa6ad852c2e24453016) does not match given --ca-checksum (543edb437be8e3b68c60bb09fc27bde24f26ce62bec2e44e182681c2df6ed06b)
[ERROR]  Please check if the correct certificate is configured athttps://192.168.105.200/cacerts
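
For anyone debugging the same mismatch: the error appears to compare a SHA-256 digest of the certificate downloaded from /cacerts against the value passed as --ca-checksum. The sketch below is not the install script's own code, only a minimal way to reproduce that comparison by hand, assuming `sha256sum` and `awk` are installed. The helper names `cert_checksum` and `checksum_matches` are hypothetical.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file (first field of sha256sum output).
cert_checksum() {
    sha256sum "$1" | awk '{print $1}'
}

# Return 0 when the file's digest equals the expected value, 1 otherwise.
checksum_matches() {
    [ "$(cert_checksum "$1")" = "$2" ]
}

# Usage against the server from this report (needs network access):
#   curl --insecure -fsSL https://192.168.105.200/cacerts -o /tmp/cacerts.pem
#   cert_checksum /tmp/cacerts.pem
#   checksum_matches /tmp/cacerts.pem "<value given to --ca-checksum>"
```

If the printed digest differs from the --ca-checksum value in the registration command, the two were derived from different certificates.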

To Reproduce

Follow the Rancher installation steps:

  1. Add the Helm Chart Repository
  2. Create a Namespace for Rancher
  3. Install cert-manager
  4. Install Rancher with Helm with RANCHER-GENERATED CERTIFICATES
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.my.org \
  --set bootstrapPassword=admin
W0224 07:57:41.055633   11347 warnings.go:70] cert-manager.io/v1beta1 Issuer is deprecated in v1.4+, unavailable in v1.6+; use cert-manager.io/v1 Issuer
NAME: rancher
LAST DEPLOYED: Thu Feb 24 07:57:38 2022
NAMESPACE: cattle-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Rancher Server has been installed.

NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.

Check out our docs at https://rancher.com/docs/

If you provided your own bootstrap password during installation, browse to https://rancher.my.org to get started.

If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates:

echo https://rancher.my.org/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')

To get just the bootstrap password on its own, run:
kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'
  5. Install MetalLB
  6. Assign a MetalLB IP address to the Rancher service
gpu@gpu:~$ kubectl get svc -n cattle-system
NAME              TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)                      AGE
rancher           LoadBalancer   10.109.59.199    192.168.105.200   80:31256/TCP,443:32178/TCP   128m
rancher-webhook   ClusterIP      10.109.245.121   <none>            443/TCP                      125m
webhook-service   ClusterIP      10.106.75.56     <none>            443/TCP                      125m
  7. Open the MetalLB IP address, https://192.168.105.200, and set up the Rancher UI
  8. Create a custom RKE2 cluster
  9. Copy the registration command and run it on another server

Result

The registration command fails with the same checksum mismatch error shown under “Describe the bug” above.

Expected Result

Cluster should become Active

Additional context

I also tried running the RKE1 registration command and got the error message below.

INFO: Arguments: --server https://192.168.105.200 --token REDACTED --ca-checksum 543edb437be8e3b68c60bb09fc27bde24f26ce62bec2e44e182681c2df6ed06b --etcd --controlplane --worker
INFO: Environment: CATTLE_ADDRESS=192.168.103.204 CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=test-node CATTLE_ROLE=,etcd,worker,controlplane CATTLE_SERVER=https://192.168.105.200 CATTLE_TOKEN=REDACTED
INFO: Using resolv.conf: nameserver 127.0.0.53 options edns0
WARN: Loopback address found in /etc/resolv.conf, please refer to the documentation how to configure your cluster to resolve DNS properly
INFO: https://192.168.105.200/ping is accessible
INFO: Value from https://192.168.105.200/v3/settings/cacerts is an x509 certificate
time="2022-02-24T08:44:35Z" level=info msg="Listening on /tmp/log.sock"
time="2022-02-24T08:44:35Z" level=info msg="Rancher agent version v2.6.3 is starting"
time="2022-02-24T08:44:35Z" level=info msg="Option worker=true"
time="2022-02-24T08:44:35Z" level=info msg="Option requestedHostname=test-node"
time="2022-02-24T08:44:35Z" level=info msg="Option dockerInfo={RDNY:QM7U:ESX7:MQV5:KK4H:UURP:B5VQ:LVVI:5QKI:MXVT:UNZV:UL6U 1 1 0 0 1 overlay2 [[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] [] {[local] [bridge host ipvlan macvlan null overlay] [] [awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} true false true true true true true true true true true true false 30 true 38 2022-02-24T08:44:35.39086874Z json-file cgroupfs 1 0 5.4.0-42-generic Ubuntu 20.04.1 LTS 20.04 linux x86_64 https://index.docker.io/v1/ 0xc0015be000 12 10467512320 [] /var/lib/docker    test-node [] false 20.10.11   map[io.containerd.runc.v2:{runc [] <nil>} io.containerd.runtime.v1.linux:{runc [] <nil>} runc:{runc [] <nil>}] runc {  inactive false  [] 0 0 <nil> []} false  docker-init {7b11cfaabd73bb80907dd23182b9347b4245eb5d 7b11cfaabd73bb80907dd23182b9347b4245eb5d} {v1.0.2-0-g52b36a2 v1.0.2-0-g52b36a2} {de40ad0 de40ad0} [name=apparmor name=seccomp,profile=default]  [] [WARNING: No swap limit support]}"
time="2022-02-24T08:44:35Z" level=info msg="Option customConfig=map[address:192.168.103.204 internalAddress: label:map[] roles:[etcd worker controlplane] taints:[]]"
time="2022-02-24T08:44:35Z" level=info msg="Option etcd=true"
time="2022-02-24T08:44:35Z" level=info msg="Option controlPlane=true"
time="2022-02-24T08:44:35Z" level=info msg="Certificate details from https://192.168.105.200"
time="2022-02-24T08:44:35Z" level=info msg="Certificate #0 (https://192.168.105.200)"
time="2022-02-24T08:44:35Z" level=info msg="Subject: CN=dynamic,O=dynamic"
time="2022-02-24T08:44:35Z" level=info msg="Issuer: CN=dynamiclistener-ca,O=dynamiclistener-org"
time="2022-02-24T08:44:35Z" level=info msg="IsCA: false"
time="2022-02-24T08:44:35Z" level=info msg="DNS Names: <none>"
time="2022-02-24T08:44:35Z" level=info msg="IPAddresses: [10.103.240.149 10.244.0.34 10.244.0.35 10.244.0.36 192.168.105.200]"
time="2022-02-24T08:44:35Z" level=info msg="NotBefore: 2022-02-24 08:06:45 +0000 UTC"
time="2022-02-24T08:44:35Z" level=info msg="NotAfter: 2023-02-24 08:14:38 +0000 UTC"
time="2022-02-24T08:44:35Z" level=info msg="SignatureAlgorithm: ECDSA-SHA256"
time="2022-02-24T08:44:35Z" level=info msg="PublicKeyAlgorithm: ECDSA"
time="2022-02-24T08:44:35Z" level=info msg="Certificate details for /etc/kubernetes/ssl/certs/serverca"
time="2022-02-24T08:44:35Z" level=info msg="Certificate #0 (/etc/kubernetes/ssl/certs/serverca)"
time="2022-02-24T08:44:35Z" level=info msg="Subject: CN=dynamiclistener-ca,O=dynamiclistener-org"
time="2022-02-24T08:44:35Z" level=info msg="Issuer: CN=dynamiclistener-ca,O=dynamiclistener-org"
time="2022-02-24T08:44:35Z" level=info msg="IsCA: true"
time="2022-02-24T08:44:35Z" level=info msg="DNS Names: <none>"
time="2022-02-24T08:44:35Z" level=info msg="IPAddresses: <none>"
time="2022-02-24T08:44:35Z" level=info msg="NotBefore: 2022-02-24 08:06:45 +0000 UTC"
time="2022-02-24T08:44:35Z" level=info msg="NotAfter: 2032-02-22 08:06:45 +0000 UTC"
time="2022-02-24T08:44:35Z" level=info msg="SignatureAlgorithm: ECDSA-SHA256"
time="2022-02-24T08:44:35Z" level=info msg="PublicKeyAlgorithm: ECDSA"
time="2022-02-24T08:44:35Z" level=fatal msg="Certificate chain is not complete, please check if all needed intermediate certificates are included in the server certificate (in the correct order) and if the cacerts setting in Rancher either contains the correct CA certificate (in the case of using self signed certificates) or is empty (in the case of using a certificate signed by a recognized CA). Certificate information is displayed above. error: Get \"https://192.168.105.200\": x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"dynamiclistener-ca\")"

When I compare these three certificates:

https://192.168.105.200/cacerts
https://192.168.105.200/v3/settings/cacerts
# other node's RKE1 cluster's certificate
/etc/kubernetes/ssl/certs/serverca

the certificate served at https://192.168.105.200/cacerts matches neither https://192.168.105.200/v3/settings/cacerts nor /etc/kubernetes/ssl/certs/serverca.

rancher@test-node:~$ sudo cat /etc/kubernetes/ssl/certs/serverca
# from secret tls-rancher-ingress tls.crt
-----BEGIN CERTIFICATE-----
MIIBpjCCAU2gAwIBAgIBADAKBggqhkjOPQQDAjA7MRwwGgYDVQQKExNkeW5hbWlj
bGlzdGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwHhcNMjIw
MjI0MDgwNjQ1WhcNMzIwMjIyMDgwNjQ1WjA7MRwwGgYDVQQKExNkeW5hbWljbGlz
dGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwWTATBgcqhkjO
PQIBBggqhkjOPQMBBwNCAARjtDqGqCdf7zviIoN+pm5AJDIAzyu43r77yVykDjek
sLjIWrF4EFoFJX4y8mDbimCbqs8gI30gwDg34DWN2V2Jo0IwQDAOBgNVHQ8BAf8E
BAMCAqQwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQU/3xAlKB6NQ9iZVESnZpD
A/h85/8wCgYIKoZIzj0EAwIDRwAwRAIgR0fDy3RlHdAt5yqykVpY+d9Uu+iRMK6r
6XmO9mTT+C0CIHkWA3jc7iaPyBpwQ8XEaiJk/qNzds+yPz1Col0PsZN6
-----END CERTIFICATE-----
rancher@test-node:~$ curl --insecure -fL https://192.168.105.200/cacerts
# from tls-rancher-internal-ca tls.crt
-----BEGIN CERTIFICATE-----
MIIBqDCCAU2gAwIBAgIBADAKBggqhkjOPQQDAjA7MRwwGgYDVQQKExNkeW5hbWlj
bGlzdGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwHhcNMjIw
MjI0MDgwNjQ1WhcNMzIwMjIyMDgwNjQ1WjA7MRwwGgYDVQQKExNkeW5hbWljbGlz
dGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwWTATBgcqhkjO
PQIBBggqhkjOPQMBBwNCAARLB2/0vt9+n9goE/XKrDT60w6KLEgu5KCwkbfbph44
EL+Su+RxjS5BnCtG1xclbLDX1/+uOELKxapsbo7OGd/zo0IwQDAOBgNVHQ8BAf8E
BAMCAqQwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQUYq7xov/clVsjODNDzu3d
nAbbrYgwCgYIKoZIzj0EAwIDSQAwRgIhAPGlQjJ8vY5pqsQQuHWNu6Et6lqqrOiA
aZfUBJ+uHlC8AiEA7rqnq0bc1ZPmK7ygjxiL/KLbPCUaxCKqCrIz2Xrv+4g=
-----END CERTIFICATE-----
rancher@rancher-docker:~$ curl --insecure -fL https://192.168.105.200/v3/settings/cacerts
# from secret tls-rancher-ingress tls.crt
-----BEGIN CERTIFICATE-----
MIIBpjCCAU2gAwIBAgIBADAKBggqhkjOPQQDAjA7MRwwGgYDVQQKExNkeW5hbWlj
bGlzdGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwHhcNMjIw
MjI0MDgwNjQ1WhcNMzIwMjIyMDgwNjQ1WjA7MRwwGgYDVQQKExNkeW5hbWljbGlz
dGVuZXItb3JnMRswGQYDVQQDExJkeW5hbWljbGlzdGVuZXItY2EwWTATBgcqhkjO
PQIBBggqhkjOPQMBBwNCAARjtDqGqCdf7zviIoN+pm5AJDIAzyu43r77yVykDjek
sLjIWrF4EFoFJX4y8mDbimCbqs8gI30gwDg34DWN2V2Jo0IwQDAOBgNVHQ8BAf8E
BAMCAqQwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQU/3xAlKB6NQ9iZVESnZpD
A/h85/8wCgYIKoZIzj0EAwIDRwAwRAIgR0fDy3RlHdAt5yqykVpY+d9Uu+iRMK6r
6XmO9mTT+C0CIHkWA3jc7iaPyBpwQ8XEaiJk/qNzds+yPz1Col0PsZN6
-----END CERTIFICATE-----
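
Comparing the dumps above by eye is error-prone, since the first few Base64 lines are identical in all three. A minimal sketch for comparing saved copies of them follows. It hashes only the PEM block and ignores annotations such as the "# from secret ..." lines. The helper names `pem_digest` and `same_cert` are hypothetical, and the comparison is textual, so it assumes both files use the same line endings.

```shell
#!/bin/sh
# Print the SHA-256 digest of only the PEM certificate block(s) in a file,
# skipping comment lines or other text around them.
pem_digest() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$1" \
        | sha256sum | awk '{print $1}'
}

# Return 0 when two files contain the same PEM certificate text.
same_cert() {
    [ "$(pem_digest "$1")" = "$(pem_digest "$2")" ]
}

# Usage with saved copies of the outputs above (file names are examples):
#   same_cert serverca.pem settings-cacerts.pem && echo "same certificate"
#   same_cert serverca.pem cacerts.pem || echo "different certificates"
```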

I also tried generating tls.key, tls.crt, and cacerts.pem and installing Rancher with a private CA and the ingress.tls.source=secret option, but I got the same cacerts checksum mismatch.

gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ openssl genrsa -out cacerts.pem 2048
Generating RSA private key, 2048 bit long modulus (2 primes)
.......................................+++++
...+++++
e is 65537 (0x010001)
gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ openssl genrsa -out tls.key 2048
Generating RSA private key, 2048 bit long modulus (2 primes)
...............................+++++
...................................................................................+++++
e is 65537 (0x010001)
gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ openssl req -x509 -new -nodes -days 365 -key tls.key -out tls.crt \
> -subj "/CN=rancher.my.org"
gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ ls
cacerts.pem  tls.crt  tls.key
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.my.org \
  --set ingress.tls.source=secret \
  --set privateCA=true
gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ kubectl -n cattle-system create secret tls tls-rancher-ingress \
> --cert=tls.crt \
> --key=tls.key
secret/tls-rancher-ingress created
gpu@gpu:/data/nas-nfs/kubernetes/seunghyun/rancher/openssl$ kubectl -n cattle-system create secret generic tls-ca \
> --from-file=cacerts.pem=./cacerts.pem
secret/tls-ca created
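
One thing worth checking in the transcript above: `openssl genrsa -out cacerts.pem 2048` writes an RSA private key, not a certificate, so the tls-ca secret created from it does not contain a CA certificate. Below is a small sketch of a sanity check that could be run before creating the secret. The helper name `is_pem_certificate` is hypothetical, and the suggestion in the comments that a self-signed tls.crt can serve as its own cacerts.pem is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Return 0 when the file contains at least one PEM certificate block.
is_pem_certificate() {
    grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage before creating the tls-ca secret:
#   is_pem_certificate cacerts.pem || echo "cacerts.pem is not a certificate"
#
# Assumption: because tls.crt above is self-signed, it is its own issuer,
# so a copy of it could be used as the CA file:
#   cp tls.crt cacerts.pem
```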

Update (2022-02-25):
When I install Rancher on a single node using Docker, all three certificates match.

https://192.168.103.205/cacerts
https://192.168.103.205/v3/settings/cacerts
# other node's RKE1 cluster's certificate
/etc/kubernetes/ssl/certs/serverca

Comparing the certificates of the Helm install and the Docker install shows a difference.

The Docker install's certificates all come from the Kubernetes secret tls-rancher-ingress.
The Helm install's certificates, on the other hand, come from two Kubernetes secrets: tls-rancher-ingress and tls-rancher-internal-ca.

helm rancher

rancher@test-node:~$ sudo cat /etc/kubernetes/ssl/certs/serverca
# from secret tls-rancher-ingress tls.crt
rancher@test-node:~$ curl --insecure -fL https://192.168.105.200/cacerts
# from tls-rancher-internal-ca tls.crt
rancher@rancher-docker:~$ curl --insecure -fL https://192.168.105.200/v3/settings/cacerts
# from secret tls-rancher-ingress tls.crt

docker rancher

rancher@test-node:~$ sudo cat /etc/kubernetes/ssl/certs/serverca
# from secret tls-rancher-ingress tls.crt
rancher@test-node:~$ curl --insecure -fL https://192.168.103.205/cacerts
# from secret tls-rancher-ingress tls.crt
rancher@rancher-docker:~$ curl --insecure -fL https://192.168.103.205/v3/settings/cacerts
# from secret tls-rancher-ingress tls.crt

After replacing tls-rancher-internal-ca’s tls.crt and tls.key with tls-rancher’s tls.crt and tls.key,
both issues below were solved:

  • “Certificate chain is not complete, please check if all needed intermediate certificates are included in the server certificate (in the correct order) and if the cacerts setting in Rancher either contains the correct CA certificate (in the case of using self signed certificates) or is empty (in the case of using a certificate signed by a recognized CA). Certificate information is displayed above. error: Get "https://192.168.105.200": x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "dynamiclistener-ca")” in RKE1 cluster
  • “Configured cacerts checksum (1382944946dbe8c6faf7d0bd6d33d6593f3416579e75efa6ad852c2e24453016) does not match given --ca-checksum (543edb437be8e3b68c60bb09fc27bde24f26ce62bec2e44e182681c2df6ed06b)” in RKE2 cluster

Correction: replace tls-rancher-internal-ca’s tls.crt and tls.key with the tls.crt and tls.key from tls-rancher-ingress, not from tls-rancher.
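
The thread does not show the exact commands used for this replacement, so the following is only a sketch of one way it could be done with standard kubectl commands, assuming both secrets live in the cattle-system namespace and are of type kubernetes.io/tls. The function name `copy_tls_secret` and the `KUBECTL` and `NS` variables are hypothetical. Whether Rancher needs a restart afterwards is not stated in this thread.

```shell
#!/bin/sh
# KUBECTL and NS can be overridden from the environment.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-cattle-system}"

# Copy tls.crt and tls.key from secret $1 into secret $2.
copy_tls_secret() {
    src="$1"
    dst="$2"
    workdir=$(mktemp -d) || return 1
    # Secret data is base64-encoded; decode it into temporary files.
    $KUBECTL -n "$NS" get secret "$src" -o jsonpath='{.data.tls\.crt}' \
        | base64 -d > "$workdir/tls.crt" &&
    $KUBECTL -n "$NS" get secret "$src" -o jsonpath='{.data.tls\.key}' \
        | base64 -d > "$workdir/tls.key" &&
    # Render the destination secret from the decoded files and apply it,
    # which overwrites the existing secret's data.
    $KUBECTL -n "$NS" create secret tls "$dst" \
        --cert="$workdir/tls.crt" --key="$workdir/tls.key" \
        --dry-run=client -o yaml \
        | $KUBECTL -n "$NS" apply -f -
    status=$?
    rm -rf "$workdir"
    return $status
}

# Usage matching the correction above (back up the old secret first):
#   kubectl -n cattle-system get secret tls-rancher-internal-ca -o yaml > backup.yaml
#   copy_tls_secret tls-rancher-ingress tls-rancher-internal-ca
```

Rendering with --dry-run=client and piping into apply avoids deleting the existing secret before its replacement is ready.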


Hi Kosehy,
I ran into the same issue and have been stuck on it for a week. Could you give more detail on how you replaced tls-rancher-internal-ca to solve the problem?


Dear dyllanwli
I have the same problem. Have you resolved it yet?