Iscsiadm discovery fails Container-Linux RKE K8s (solved)

I have reached a dead end trying to demo Longhorn on my home cluster.

In Longhorn UI volume shows as attaching and pod using volume hangs.

I have Kubernetes 1.11.3 cluster setup via RKE running on Container-Linux 1911.4.0 (docker 18.06.1-ce) on 3 physical nodes (all of them with roles controlplane, etcd and worker) network is calico (default)

I have set up Longhorn via

kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3.2/deploy/longhorn.yaml
kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3.2/examples/pvc.yaml

EDIT: I have tried explicitly asking for flexvolume instead of csi and setting path for volumeplugins but errors are nearly the same.

Pod using provisioned volume hangs on

Readiness probe failed: ls: cannot access '/dev/longhorn/pvc-c46b43d9f67b11e8': No such file or directory

Container I guess responsible for attaching to containers shows pvc-c46b43d9f67b11e8-e-76318540

time="2018-12-02T21:47:02Z" level=info msg="launcher: controller a56fbf18-138a-4604-9aba-b6aacad0a9d9 started"
time="2018-12-02T21:47:02Z" level=info msg="Starting with replicas [\"tcp://10.42.1.23:9502\" \"tcp://10.42.0.19:9502\" \"tcp://10.42.2.27:9502\"]"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.1.23:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.0.19:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.2.27:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.1.23:9502 revision counter 0"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.0.19:9502 revision counter 0"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.2.27:9502 revision counter 0"
time="2018-12-02T21:47:03Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T21:47:03Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T21:47:04Z" level=info msg="New data socket connnection established"
time="2018-12-02T21:47:05Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:05Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:08Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:08Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:11Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:11Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:14Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:14Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:17Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:17Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:20Z" level=fatal msg="Failed to start up frontend: failed to start frontend: Failed to execute: longhorn-engine-launcher [--url 0.0.0.0:9510 frontend-start --id a56fbf18-138a-4604-9aba-b6aacad0a9d9], output time=\"2018-12-02T21:47:20Z\" level=fatal msg=\"Error running frontend-start command: failed to start frontend: rpc error: code = Unknown desc = Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8 -p 10.42.1.24 --login], output iscsiadm: No records found\\n, error exit status 21.\"\n, error exit status 1"

EDIT: Errors from flexvolume try

time="2018-12-02T22:13:37Z" level=info     msg="launcher: controller 25951b2f-4467-4ad2-9273-49322ee5a680 started"
time="2018-12-02T22:13:37Z" level=info msg="Starting with replicas [\"tcp://10.42.1.31:9502\" \"tcp://10.42.0.25:9502\" \"tcp://10.42.2.34:9502\"]"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.1.31:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.0.25:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.2.34:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.1.31:9502 revision counter 0"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.0.25:9502 revision counter 0"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.2.34:9502 revision counter 0"
time="2018-12-02T22:13:38Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T22:13:38Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T22:13:39Z" level=info msg="New data socket connnection established"
time="2018-12-02T22:13:40Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:40Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:43Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:43Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:46Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:46Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:49Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:49Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:52Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:52Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:55Z" level=fatal msg="Failed to start up frontend: failed to start frontend: Failed to execute: longhorn-engine-launcher [--url 0.0.0.0:9510 frontend-start --id 25951b2f-4467-4ad2-9273-49322ee5a680], output time=\"2018-12-02T22:13:55Z\" level=fatal msg=\"Error running frontend-start command: failed to start frontend: rpc error: code = Unknown desc = Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f -p 10.42.2.35 --login], output iscsiadm: No records found\\n, error exit status 21.\"\n, error exit status 1"

Looks like the iscsiadm on your node wasn’t setup correctly. It’s not usable by Longhorn.

The error message reported by iscsiadm is:

iscsiadm: Cannot perform discovery. Invalid Initiatorname.
iscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7

You should able to reproduce it on your node by trying to use iscsiadm to connect to any iscsi target.

The error Invalid Initiatorname lead me to https://www.linuxquestions.org/questions/linux-server-73/iscsi-intiator-not-connecting-4175496417/ , maybe you can give it a try.

1 Like

Thanks!

I solved it by running on every node. Because initiatorname.iscsi was missing.

/sbin/iscsi-iname -p "InitiatorName=iqn.2005-03.org.open-iscsi" > /etc/iscsi/initiatorname.iscsi

Now I am off to figure out :slight_smile:

MountVolume.MountDevice failed for volume "pvc-3ba96732f74711e8" : rpc error: code = Unavailable desc = grpc: the connection is unavailable

I think this error is related to the CSI directory. Probably the path was changed in Container-Linux, e.g. the location of kubelet root. Probably this can help https://kubernetes-csi.github.io/docs/CSI-Kubernetes.html

Flexvolume works.

You are somewhat right csi.sock is on host in /var/lib/kubelet/plugins/io.rancher.longhorn/ (not mounted to kubelet) and not in /var/lib/kubelet/volumeplugins/ (mounted to kubelet).
IMHO it is not due to Container-Linux just RKE does not mount this directory by default to kubelet. I will try to figure out where to add another kubelet volume mount in RKE.

Solved image

Thanks for guidance and help!

RKE’s cluster.yml needs kubectl mount for CNI to work

services:
  kubelet:
    extra_binds:
      - /var/lib/kubelet/plugins:/var/lib/kubelet/plugins:shared,z