RKE Cluster setup failed

Hi everyone! Unfortunately, my first attempt at setting up a cluster on DigitalOcean with Rancher RKE (I just wanted to try it) failed, and I'm disappointed. I spent almost two hours with no result. Please help me figure out what the problem with RKE is.
So I have rke v1.1.0 (I also tried older versions; same result).

Here's what my cluster.yml looks like:

nodes:
  - address: 10.133.1.xx
    port: "22"
    internal_address: 10.133.1.xx
    role:
      - controlplane
    hostname_override: k8s-master
    user: rke
    docker_socket: /var/run/docker.sock
    ssh_key: ""
    ssh_key_path: ~/.ssh/id_rsa
    ssh_cert: ""
    ssh_cert_path: ""
    labels: {}
    taints:
  - address: 10.133.47.xx
    port: "22"
    internal_address: 10.133.47.xx
    role:
      - worker
      - etcd
    hostname_override: k8s-worker1
    user: rke
    docker_socket: /var/run/docker.sock
    ssh_key: ""
    ssh_key_path: ~/.ssh/id_rsa
    ssh_cert: ""
    ssh_cert_path: ""
    labels: {}
    taints:
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:
    external_urls:
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:
  kubelet:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds:
    extra_env:

In general, nothing special, but every time I get this error on the master, coming from the worker:

FATA[0000] [Failed to start [rke-etcd-port-listener] container on host [10.133.47.xx]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (7afc0c060f6ec0b6fd000d99b1538a77e46c6a12497c326553ea8b30af7b1be8): Bind for 0.0.0.0:2380 failed: port is already allocated]

This container is created from the image rancher/rke-tools:v0.1.56:
893b62146ebe rancher/rke-tools:v0.1.56 "nc -kl -p 1337 -e e…" 8 minutes ago Created rke-etcd-port-listener

So in the end I have this on the worker:

893b62146ebe rancher/rke-tools:v0.1.56 "nc -kl -p 1337 -e e…" 11 minutes ago Created rke-etcd-port-listener
6ebc23ccb813 rancher/rke-tools:v0.1.56 "/bin/bash" 11 minutes ago Exited (0) 11 minutes ago cluster-state-deployer

I've researched solutions like this one, for example, and restarted Docker after cleaning up the iptables rules and chains. Same result.
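Concretely, the cleanup I tried was something like the following (a sketch, guarded so re-running it on a host without Docker or systemd is a harmless no-op; the name filters match the helper containers RKE creates):

```shell
# Remove leftover RKE helper containers from previous failed runs; even a
# container stuck in the "Created" state keeps its published host ports
# allocated in dockerd until it is removed.
docker ps -aq --filter "name=rke-" --filter "name=cluster-state-deployer" \
  2>/dev/null | xargs -r docker rm -f 2>/dev/null || true

# Restarting dockerd also clears its port-proxy bookkeeping.
sudo systemctl restart docker 2>/dev/null || true
```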

When I try to start rke-etcd-port-listener by hand, I get this error:
Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (b30d465d7f467969e61542948d612b5199eb65310477d56574db730b77709591): Bind for 0.0.0.0:2380 failed: port is already allocated Error: failed to start containers: 893b62146ebe
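To see what is actually holding the port on the worker, something like this helps (a sketch; ss comes from iproute2, and running `ss -ltnp` as root also shows the owning process):

```shell
# Is anything listening on the etcd peer port 2380?
ss -ltn 2>/dev/null | grep ':2380' || echo "no TCP listener on 2380"

# Docker keeps a published port allocated as long as any container (running,
# stopped, or merely "Created") still maps it; list anything publishing 2380.
docker ps -a --filter "publish=2380" 2>/dev/null || true
```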

It looks like rke up simply doesn't work from scratch. I don't have any other options to try…

I think there might be a formatting or indentation issue in your cluster.yml. I noticed that our RKE example cluster.yml file doesn't have quotation marks around the port, but your cluster.yml does: https://rancher.com/docs/rke/latest/en/example-yamls/
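One related thing worth checking: copy-pasting YAML from a web page or word processor can introduce curly "smart" quotes, which YAML does not treat as string delimiters (only straight ' and " are). A small scan like this can spot them (a sketch; the helper name and sample are made up):

```python
# Report lines of a YAML document that contain curly "smart" quotes;
# YAML only recognizes straight ' and " as scalar quoting characters.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"  # i.e. the characters “ ” ‘ ’

def find_smart_quotes(text):
    """Return (line_number, line) pairs that contain a smart quote."""
    return [(n, line) for n, line in enumerate(text.splitlines(), 1)
            if any(q in line for q in SMART_QUOTES)]

# A fragment resembling the cluster.yml above, with one curly-quoted value:
sample = 'nodes:\n  - address: 10.133.1.xx\n    port: \u201c22\u201d\n'
for n, line in find_smart_quotes(sample):
    print(f"line {n}: {line.strip()}")
# prints: line 3: port: “22”
```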

Hm… interesting. Thank you for the answer, I'll check this.