Environment
I have an HA Rancher v2.6.3 cluster behind a Traefik reverse proxy, which handles TLS termination and routes to the 3 nodes of the etcd/cp/worker Rancher install. This all runs on Proxmox VMs. My cert was issued by Let’s Encrypt.
I recently added a bare metal install of Harvester v1.0.0.
It is currently a single machine, with all traffic handled over the management NIC, which has a gigabit link to the switch.
This Harvester node was successfully integrated into my Rancher cluster, with the intent of spinning up new clusters on demand.
Harvester Setup
I set up a network with VLAN ID 1 and gave it access to my whole home network, 192.168.0.0/16. I have tried this both with DHCP and with the route set statically in Harvester.
Problem
When provisioning a cluster through Harvester (RKE1/RKE2/K3s), the VMs aren’t created and I get the following error:
failing bootstrap machine(s) k3s-pool1-65b4c9d49c-qv42q: failed creating server (HarvesterMachine) in infrastructure provider: CreateError: Failure detected from referenced resource rke-machine.cattle.io/v1, Kind=HarvesterMachine with name “k3s-pool1-d7010ffa-52wvw”: Downloading driver from {{redacted because too many links in post}}/harvester-node-driver/v0.3.4/docker-machine-driver-harvester-amd64.tar.gz
docker-machine-driver-harvester-amd64.tar.gz
docker-machine-driver-harvester-amd64.tar.gz: gzip compressed data, from Unix, original size 36115968
Running pre-create checks…
Error with pre-create check: “Get "{{redacted because too many links in post}}/k8s/clusters/c-m-zrshdjzq/apis/harvesterhci.io/v1beta1/settings/server-version": x509: certificate signed by unknown authority”
The default lines below are for a sh/bash shell, you can specify the shell you’re using, with the --shell flag. and join url to be available on bootstrap node
Troubleshooting
Following these steps: Rancher Helm Chart Options | Rancher
I set the additionalTrustedCAs option to true in my Rancher Helm deployment:
helm upgrade rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.plmr.cloud \
  --set additionalTrustedCAs=true
I uploaded ca-additional.pem as the tls-ca-additional secret. The file I used as the CA to trust this cert chain was obtained by opening my cert in Firefox and downloading the PEM (cert) for ISRG Root X1.
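For reference, the Rancher docs create this secret in the cattle-system namespace under the name tls-ca-additional, with the key ca-additional.pem. A minimal sketch of that step follows; the function name create_additional_ca_secret and the default local path ./ca-additional.pem are assumptions for illustration, not something Rancher provides.

```shell
# Hypothetical wrapper (not part of Rancher) around the kubectl command
# from the Rancher docs. Assumption: the PEM is saved locally, by default
# at ./ca-additional.pem.
create_additional_ca_secret() {
  pem_path="${1:-./ca-additional.pem}"
  # Rancher looks for a secret named tls-ca-additional in cattle-system,
  # holding the CA bundle under the key ca-additional.pem.
  kubectl -n cattle-system create secret generic tls-ca-additional \
    --from-file=ca-additional.pem="$pem_path"
}
```

It would be called as: create_additional_ca_secret ./ca-additional.pem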
I also set the Harvester setting “additional-ca” to this same ISRG Root X1 PEM file, which, as I understand it, should act as the root CA for trusting all Let’s Encrypt-issued certs.
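One way to sanity-check that understanding is with openssl verify. The sketch below assumes openssl is installed; verify_against_ca is a hypothetical helper name and all file paths are placeholders. A leaf certificate normally chains to the root through an intermediate, so the intermediate has to be supplied (or served by the proxy) for verification against the root alone to succeed.

```shell
# Hypothetical helper (not part of Rancher or Harvester).
# Usage: verify_against_ca CA_PEM LEAF_PEM [INTERMEDIATES_PEM]
# Returns 0 only if openssl can verify LEAF_PEM using CA_PEM as the
# trusted CA file; depending on the openssl build, the system default
# trust directory may also be consulted.
verify_against_ca() {
  ca_file="$1"
  leaf_file="$2"
  chain_file="${3:-}"
  if [ -n "$chain_file" ]; then
    # -untrusted supplies intermediates that are not themselves trust anchors
    openssl verify -CAfile "$ca_file" -untrusted "$chain_file" "$leaf_file"
  else
    openssl verify -CAfile "$ca_file" "$leaf_file"
  fi
}
```

For example: verify_against_ca ./ca-additional.pem ./leaf.pem ./intermediate.pem, where leaf.pem and intermediate.pem are placeholder names for the certificates exported from the browser.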
Ask
Can anyone help me understand whether this issue is because I have something misconfigured in Rancher, or in Harvester? It’s unclear to me whether the Provisioning Log error above is from Rancher attempting to call itself, or if it’s coming back from Harvester. I don’t have another cloud provider configured to do side by side testing.
I was able to spin up a regular workload cluster in Proxmox VMs and did not encounter this issue when registering it to Rancher using the Registration Command that Rancher provides.
If I need to move this to the Harvester section, just let me know.