Issue with rke up private keys

I currently get the following when running rke up:
INFO[0000] Running RKE version: v1.2.8
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [10.222.0.40]
INFO[0000] [dialer] Setup tunnel for host [10.222.0.42]
INFO[0000] [dialer] Setup tunnel for host [10.222.0.41]
WARN[0000] Failed to set up SSH tunneling for host [10.222.0.40]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [10.222.0.40:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Failed to set up SSH tunneling for host [10.222.0.41]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [10.222.0.41:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Failed to set up SSH tunneling for host [10.222.0.42]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [10.222.0.42:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Removing host [10.222.0.40] from node lists
WARN[0000] Removing host [10.222.0.41] from node lists
WARN[0000] Removing host [10.222.0.42] from node lists
FATA[0000] Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [10.222.0.40]

config:

nodes:
  - address: 10.222.0.42
    port: "22"
    internal_address: 10.222.0.42
    role:
      - controlplane
      - worker
      - etcd
    user: fedora
    docker_socket: /var/run/docker.sock
    ssh_key_path: ~/.ssh/vsphere
  - address: 10.222.0.41
    port: "22"
    internal_address: 10.222.0.41
    role:
      - controlplane
      - worker
      - etcd
    user: fedora
    docker_socket: /run/docker.sock
    ssh_key_path: ~/.ssh/vsphere
  - address: 10.222.0.40
    port: "22"
    internal_address: 10.222.0.40
    role:
      - controlplane
      - worker
      - etcd
    user: fedora
    docker_socket: /run/docker.sock
    ssh_key_path: ~/.ssh/vsphere

But I can SSH in manually with the same key and reach the Docker socket through a tunnel:

$ ssh -i ~/.ssh/vsphere -L localhost:8888:/var/run/docker.sock fedora@10.222.0.40
Last login: Tue Jun 1 17:24:01 2021 from ::1

$ curl -v localhost:8888/info
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8888 (#0)
> GET /info HTTP/1.1
> Host: localhost:8888
> User-Agent: curl/7.64.1
> Accept: */*

< HTTP/1.1 200 OK
< Api-Version: 1.41
< Content-Type: application/json
< Docker-Experimental: false
< Ostype: linux
< Server: Docker/20.10.6 (linux)
< Date: Tue, 01 Jun 2021 13:34:42 GMT
< Transfer-Encoding: chunked
<
{"ID":"YCHH:3SHM:CGUY:UA6G:IAFR:PIBX:Y3MW:NUIO:LHQ4:7YMP:GFPK:KTTQ","Containers":0,"ContainersRunning":0,"ContainersPaused":0,"ContainersStopped":0,"Images":0,"Driver":"overlay2","DriverStatus":[["Backing Filesystem","xfs"],["Supports d_type","true"],["Native Overlay Diff","true"],["userxattr","false"]],"Plugins":{"Volume":["local"],"Network":["bridge","host","ipvlan","macvlan","null","overlay"],"Authorization":null,"Log":["awslogs","fluentd","gcplogs","gelf","journald","json-file","local","logentries","splunk","syslog"]},"MemoryLimit":true,"SwapLimit":true,"KernelMemory":false,"KernelMemoryTCP":false,"CpuCfsPeriod":true,"CpuCfsQuota":true,"CPUShares":true,"CPUSet":true,"PidsLimit":true,"IPv4Forwarding":true,"BridgeNfIptables":true,"BridgeNfIp6tables":true,"Debug":false,"NFd":24,"OomKillDisable":false,"NGoroutines":33,"SystemTime":"2021-06-01T17:34:42.248812117+04:00","LoggingDriver":"json-file","CgroupDriver":"systemd","CgroupVersion":"2","NEventsListener":0,"KernelVersion":"5.8.15-301.fc33.x86_64","OperatingSystem":"Fedora 33 (Thirty Three)","OSVersion":"33","OSType":"linux","Architecture":"x86_64","IndexServerAddress":"https://index.docker.io/v1/","RegistryConfig":{"AllowNondistributableArtifactsCIDRs":[],"AllowNondistributableArtifactsHostnames":[],"InsecureRegistryCIDRs":["127.0.0.0/8"],"IndexConfigs":{"docker.io":{"Name":"docker.io","Mirrors":[],"Secure":true,"Official":true}},"Mirrors":[]},"NCPU":2,"MemTotal":8340643840,"GenericResources":null,"DockerRootDir":"/var/lib/docker","HttpProxy":"","HttpsProxy":"","NoProxy":"","Name":"rke-7d5b-1","Labels":[],"ExperimentalBuild":false,"ServerVersion":"20.10.6","Runtimes":{"io.containerd.runc.v2":{"path":"runc"},"io.containerd.runtime.v1.linux":{"path":"runc"},"runc":{"path":"runc"}},"DefaultRuntime":"runc","Swarm":{"NodeID":"","NodeAddr":"","LocalNodeState":"inactive","ControlAvailable":false,"Error":"","RemoteManagers":null},"LiveRestoreEnabled":false,"Isolation":"","InitBinary":"docker-init","ContainerdCommit":{"ID":"d71fcd7d8303cbf684402823e425e9dd2e99285d","Expected":"d71fcd7d8303cbf684402823e425e9dd2e99285d"},"RuncCommit":{"ID":"b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7","Expected":"b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7"},"InitCommit":{"ID":"de40ad0","Expected":"de40ad0"},"SecurityOptions":["name=seccomp,profile=default","name=cgroupns"],"Warnings":null}

* Connection #0 to host localhost left intact
* Closing connection 0

The requirements for RKE are documented in Rancher Docs: Requirements. Can you share the sshd version, the SSH key type, and the sshd log output, both when you run rke up and when you connect manually from the CLI?
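For anyone gathering the same details: the key type is the first field of each line in an OpenSSH public key file, so it can be read without any extra tooling. A minimal sketch follows. The function name pubkey_types is made up for illustration, and it assumes the lines carry no leading authorized_keys options and that the public half of the key sits next to the private key as ~/.ssh/vsphere.pub, which the original post does not confirm.

```shell
# Print the key type (first field) of every key read from stdin.
# Blank lines and comment lines are skipped.
# Assumption: no authorized_keys options precede the key type.
pubkey_types() {
  awk 'NF && $1 !~ /^#/ { print $1 }'
}

# Hypothetical usage (the .pub path is an assumption):
#   pubkey_types < ~/.ssh/vsphere.pub
```

An RSA key shows up as ssh-rsa, which is the name that appears in the sshd log further down the thread.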

This is now fixed. The sshd logs pointed to the cause:

Jun  2 12:09:15 rke-7d5b-1 sshd[1053]: userauth_pubkey: key type ssh-rsa not in PubkeyAcceptedKeyTypes [preauth]
Jun  2 12:09:15 rke-7d5b-1 sshd[1053]: debug2: userauth_pubkey: authenticated 0 pkalg ssh-rsa [preauth]
Jun  2 12:09:15 rke-7d5b-1 sshd[1053]: debug3: user_specific_delay: user specific delay 0.000ms [preauth]
Jun  2 12:09:15 rke-7d5b-1 sshd[1053]: debug3: ensure_minimum_time_since: elapsed 0.069ms, delaying 8.764ms (requested 8.833ms) [preauth]
Jun  2 12:09:15 rke-7d5b-1 sshd[1053]: debug3: userauth_finish: failure partial=0 next methods="publickey,gssapi-keyex,gssapi-with-mic,password" [preauth]

Adding ssh-rsa to PubkeyAcceptedKeyTypes in /etc/crypto-policies/back-ends/opensshserver.conf does the trick.
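To check whether a node already accepts a given key type before running rke up, the policy file can be searched for it. This is only a sketch under assumptions: the function name policy_accepts_key_type is hypothetical, and it assumes the file carries the list either as -oPubkeyAcceptedKeyTypes=a,b,c inside a CRYPTO_POLICY line or as a plain "PubkeyAcceptedKeyTypes a,b,c" line (the newer option name PubkeyAcceptedAlgorithms is matched too).

```shell
# Read policy file content on stdin and succeed (exit 0) only if the
# given key type is listed in PubkeyAcceptedKeyTypes/PubkeyAcceptedAlgorithms.
# Assumption: the list is comma-separated and ends at a space or a quote.
policy_accepts_key_type() {
  key_type="$1"
  grep -oE "PubkeyAccepted(KeyTypes|Algorithms)[= ][^ ']+" \
    | sed -E 's/^[^= ]+[= ]//' \
    | tr ',' '\n' \
    | grep -xF -- "$key_type" >/dev/null
}

# Hypothetical usage on the node:
#   policy_accepts_key_type ssh-rsa \
#     < /etc/crypto-policies/back-ends/opensshserver.conf \
#     && echo "ssh-rsa accepted" || echo "ssh-rsa not accepted"
```

The exact-line match (-x) matters here: without it, rsa-sha2-256 or similar entries could be mistaken for the entry being looked up.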

It seems that ssh.Client in "golang.org/x/crypto/ssh" requires it. I wonder whether RKE could set options when it creates the client to widen the accepted key types. Maybe here: rke/tunnel.go at b523c2b415121c30a0e63aeed71b6c61a75c116e · rancher/rke · GitHub

I know Fedora and RKE might be an awkward combination, but anyone using vanilla Fedora will hit this issue.

Anyway, it works now.

Thanks

Just a further note: ciphers, key signatures and algorithms are being dropped in Fedora 33, in line with the decision to drop support for TLS 1.0, TLS 1.1 and old SSH2 keys.

https://fedoraproject.org/wiki/Changes/StrongCryptoSettings2