Help on issue "[FATAL] k3s exited with: exit status 255"

I ran into an issue when installing Rancher on a single node, following the online doc "Rancher Docs: Installing Rancher on a Single Node Using Docker".
I tried to start the container with this command:

docker run -d --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  --privileged \
  rancher/rancher:latest

These are the error logs I got from the terminal:

[root@j ~]# docker ps
CONTAINER ID   IMAGE                    COMMAND           CREATED          STATUS          PORTS                                                                      NAMES
ba140d50ae67   rancher/rancher:latest   "entrypoint.sh"   13 minutes ago   Up 16 seconds   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   xenodochial_germain
[root@j ~]# docker logs -n 10  -f ba140d50ae67
raft2021/12/05 08:04:57 INFO: 8e9e05c52164694d became candidate at term 44
raft2021/12/05 08:04:57 INFO: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 44
raft2021/12/05 08:04:57 INFO: 8e9e05c52164694d became leader at term 44
raft2021/12/05 08:04:57 INFO: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 44
2021-12-05 08:04:57.125704 I | embed: ready to serve client requests
2021-12-05 08:04:57.125730 I | etcdserver: published {Name:default ClientURLs:[http://localhost:2379]} to cluster cdf818194e3a8c32
2021-12-05 08:04:57.126459 N | embed: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
2021/12/05 08:04:57 [INFO] Waiting for server to become available: Get "https://127.0.0.1:6443/version?timeout=15m0s": dial tcp 127.0.0.1:6443: connect: connection refused
exit status 255
2021/12/05 08:05:05 [FATAL] k3s exited with: exit status 255

I am using CentOS, installed in a VM on ESXi:

[root@j ~]# cat /etc/redhat-release 
CentOS Linux release 8.5.2111
[root@j ~]# docker version
Client: Docker Engine - Community
 Version:           20.10.11
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        dea9396
 Built:             Thu Nov 18 00:36:58 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.11
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.9
  Git commit:       847da18
  Built:            Thu Nov 18 00:35:20 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Has anybody encountered the same issue? What is the root cause, and is there a workaround?
Thanks

I think 6443 is the kube-apiserver port, so if that's refusing connections nothing else will work. Why that's happening with the Docker install I'm less sure (I only ran the Docker version once, to look at the UI, and then quit it).

Hi out there

Some changes in systemd prevent the Rancher container from starting (it exits with 255)…

239-45.el8_4.3 = 0 problemo

systemd-239-51 = problemo mucho

http://rpm.pbone.net/changelog_idpl_76775267_com_systemd-239-51.el8.x86_64.rpm.html

Rancher V2.6.0

Docker version 20.10.11, build dea9396

containerd containerd.io 1.4.12 7b11cfaabd73bb80907dd23182b9347b4245eb5d

CentOS Linux 8

To get around this while performing system updates, do the following:

[root@rancher-test-mgmt ~]# yum install 'dnf-command(versionlock)'
Last metadata expiration check: 5:23:33 ago on Mon 06 Dec 2021 09:13:50 AM CET.
Dependencies resolved.

Package Architecture Version Repository Size

Installing:
python3-dnf-plugin-versionlock noarch 4.0.21-3.el8 baseos 62 k
Upgrading:
dnf-plugins-core noarch 4.0.21-3.el8 baseos 70 k
python3-dnf-plugins-core noarch 4.0.21-3.el8 baseos 234 k
yum-utils noarch 4.0.21-3.el8 baseos 73 k

Transaction Summary

Install 1 Package
Upgrade 3 Packages

Total download size: 439 k
Is this ok [y/N]: y
Downloading Packages:
(1/4): dnf-plugins-core-4.0.21-3.el8.noarch.rpm 5.4 MB/s | 70 kB 00:00
(2/4): yum-utils-4.0.21-3.el8.noarch.rpm 6.3 MB/s | 73 kB 00:00
(3/4): python3-dnf-plugin-versionlock-4.0.21-3.el8.noarch.rpm 1.9 MB/s | 62 kB 00:00
(4/4): python3-dnf-plugins-core-4.0.21-3.el8.noarch.rpm 5.8 MB/s | 234 kB 00:00

Total 1.6 MB/s | 439 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Upgrading : python3-dnf-plugins-core-4.0.21-3.el8.noarch 1/7
Upgrading : dnf-plugins-core-4.0.21-3.el8.noarch 2/7
Upgrading : yum-utils-4.0.21-3.el8.noarch 3/7
Installing : python3-dnf-plugin-versionlock-4.0.21-3.el8.noarch 4/7
Cleanup : yum-utils-4.0.18-4.el8.noarch 5/7
Cleanup : dnf-plugins-core-4.0.18-4.el8.noarch 6/7
Cleanup : python3-dnf-plugins-core-4.0.18-4.el8.noarch 7/7
Running scriptlet: python3-dnf-plugins-core-4.0.18-4.el8.noarch 7/7
Verifying : python3-dnf-plugin-versionlock-4.0.21-3.el8.noarch 1/7
Verifying : dnf-plugins-core-4.0.21-3.el8.noarch 2/7
Verifying : dnf-plugins-core-4.0.18-4.el8.noarch 3/7
Verifying : python3-dnf-plugins-core-4.0.21-3.el8.noarch 4/7
Verifying : python3-dnf-plugins-core-4.0.18-4.el8.noarch 5/7
Verifying : yum-utils-4.0.21-3.el8.noarch 6/7
Verifying : yum-utils-4.0.18-4.el8.noarch 7/7

Upgraded:
dnf-plugins-core-4.0.21-3.el8.noarch python3-dnf-plugins-core-4.0.21-3.el8.noarch yum-utils-4.0.21-3.el8.noarch
Installed:
python3-dnf-plugin-versionlock-4.0.21-3.el8.noarch

Complete!
[root@rancher-test-mgmt ~]# yum versionlock systemd*
Last metadata expiration check: 5:23:47 ago on Mon 06 Dec 2021 09:13:50 AM CET.
Adding versionlock on: systemd-pam-0:239-45.el8_4.3.*
Adding versionlock on: systemd-libs-0:239-45.el8_4.3.*
Adding versionlock on: systemd-0:239-45.el8_4.3.*
Adding versionlock on: systemd-udev-0:239-45.el8_4.3.*

The problem must come from one of the following changes in systemd:

  • Thu Jun 24 2021 systemd maintenance team - 239-48
      - cgroup: Also set io.bfq.weight (#1927290)
      - seccomp: allow turning off of seccomp filtering via env var (#1916835)
      - meson: remove strange dep that causes meson to enter infinite loop (#1970860)
      - copy: handle copy_file_range() weirdness on procfs/sysfs (#1970860)
      - core: Hide “Deactivated successfully” message (#1954802)
      - util: rework in_initrd() to make use of path_is_temporary_fs() (#1959339)
      - initrd: extend SYSTEMD_IN_INITRD to accept non-ramfs rootfs (#1959339)
      - initrd: do a debug log if failed to detect rootfs type (#1959339)
      - initrd: do a debug log if /etc/initrd-release doesn’t take effect (#1959339)
      - units: assign user-runtime-dir@.service to user-%i.slice (#1946453)
      - units: order user-runtime-dir@.service after systemd-user-sessions.service (#1946453)
      - units: make sure user-runtime-dir@.service is Type=oneshot (#1946453)
      - user-runtime-dir: downgrade a few log messages to LOG_DEBUG that we ignore (#1946453)
      - shared/install: Preserve escape characters for escaped unit names (#1952686)
      - basic/virt: Detect PowerVM hypervisor (#1937989)
      - man: document differences in clean exit status for Type=oneshot (#1940078)
      - busctl: add a timestamp to the output of the busctl monitor command (#1909214)
      - basic/cap-list: parse/print numerical capabilities (#1946943)
      - shared/mount-util: convert to libmount (#1885143)
      - mount-util: bind_remount: avoid calling statvfs (#1885143)
      - mount-util: use UMOUNT_NOFOLLOW in recursive umounter (#1885143)
      - test-install-root: create referenced targets (#1835351)
      - install: warn if WantedBy targets don’t exist (#1835351)
      - test-install-root: add test for unknown WantedBy= target (#1835351)
      - ceph is a network filesystem (#1952013)
      - sysctl: set kernel.core_pipe_limit=16 (#1949729)
      - core: don’t drop timer expired but not yet processed when system date is changed (#1899402)
      - core: Detect initial timer state from serialized data (#1899402)
      - rc-local: order after network-online.target (#1934028)
      - set core ulimit to 0 like on RHEL-7 (#1905582)
      - test-mountpointutil-util: do not assert in test_mnt_id() (#1910425)
  • Fri Jun 04 2021 Jan Macku - 239-47
      - systemd-binfmt: Add safeguard in triggers (#1787144)
      - spec: Requires(post) openssl-libs to fix missing /etc/machine-id (#1947438)
      - spec: Go back to using systemctl preset-all in post (#1783263, #1647172, #1118740)
      - spec: Disable libiptc support (#1817265)
  • Wed May 19 2021 systemd maintenance team - 239-46
      - Revert “udev: run link_update() with increased retry count in second invocation” (#1942299)
      - Revert “udev: make algorithm that selects highest priority devlink less susceptible to race conditions” (#1942299)
      - test/udev-test.pl: drop test cases that add mutliple devices (#1942299)

Thanks for your reply, I will look into the details of my environment.

I have the same problem on a freshly installed and updated Rocky Linux.

uname -r
4.18.0-348.2.1.el8_5.x86_64

rancher/latest

[INFO] Waiting for server to become available: Get "https://127.0.0.1:6443/version?timeout=15m0s": dial tcp 127.0.0.1:6443: connect: connection refused
exit status 255
2021/12/08 14:29:35 [FATAL] k3s exited with: exit status 255

Any hints?

@SaschaR - my only hint is to look farther up in the logs to see if you have another failure. A refused kube-apiserver connection could happen for various reasons, so the root cause may be in an error or warning farther up.
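
One way to scan farther up without reading the whole log is to filter for lines that look like failures. This is a minimal sketch, not a Rancher tool: `find_failures` is a hypothetical helper name, and the keyword list is an assumption that may miss relevant messages or match harmless ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines that look like errors or warnings.
# Reads log text on stdin, so it works with any log source.
find_failures() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -iE 'error|fatal|warn|fail' || true
}

# Example usage (container ID taken from the first post):
#   docker logs ba140d50ae67 2>&1 | find_failures | head -n 40
```

The `2>&1` matters in the usage line because `docker logs` replays the container's stderr stream on stderr, which a plain pipe would not pass to the filter.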

@wcoateRR
No, as @dre already pointed out, this is due to the kernel.
The latest kernel does not work.

The same fresh installation with kernel 4.18.0-305.7.1.el8_4.x86_64 works.

regards
Sascha

I used this hint to get it working on Rocky Linux 8 with the newest updates, slightly modified as shown below: "linux kernel - Iptables v1.4.14: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)" (Stack Overflow)

# Load the module now
sudo modprobe ip_tables
# Load it on every boot (tee runs as root; the redirect in `sudo echo ... >>` would not)
echo 'ip_tables' | sudo tee -a /etc/modules-load.d/rancher.conf
sudo reboot

regards
Sascha
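
After the reboot, it may be worth confirming that the module really was loaded before starting the container again. The following is a sketch under assumptions: `module_loaded` is a hypothetical helper, and its optional second argument (the path of the listing to read) exists only so the function can be tried against a sample file. A module compiled into the kernel does not appear in `/proc/modules`, so a negative result is not conclusive on every system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a kernel module name appears in a
# /proc/modules-style listing (module name is the first field of each line).
module_loaded() {
  local name="$1" listing="${2:-/proc/modules}"
  awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Example usage after the reboot:
#   module_loaded ip_tables && echo "ip_tables is loaded"
```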