Iscsiadm discovery fails Container-Linux RKE K8s (solved)


#1

I have reached a dead end trying to demo Longhorn on my home cluster.

In Longhorn UI volume shows as attaching and pod using volume hangs.

I have Kubernetes 1.11.3 cluster setup via RKE running on Container-Linux 1911.4.0 (docker 18.06.1-ce) on 3 physical nodes (all of them with roles controlplane, etcd and worker) network is calico (default)

I have set up Longhorn via

kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3.2/deploy/longhorn.yaml
kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3.2/examples/pvc.yaml

EDIT: I have tried explicitly asking for flexvolume instead of csi and setting path for volumeplugins but errors are nearly the same.

Pod using provisioned volume hangs on

Readiness probe failed: ls: cannot access '/dev/longhorn/pvc-c46b43d9f67b11e8': No such file or directory

Container I guess responsible for attaching to containers shows pvc-c46b43d9f67b11e8-e-76318540

time="2018-12-02T21:47:02Z" level=info msg="launcher: controller a56fbf18-138a-4604-9aba-b6aacad0a9d9 started"
time="2018-12-02T21:47:02Z" level=info msg="Starting with replicas [\"tcp://10.42.1.23:9502\" \"tcp://10.42.0.19:9502\" \"tcp://10.42.2.27:9502\"]"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.1.23:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.1.23:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.0.19:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Connecting to remote: 10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.0.19:9502"
time="2018-12-02T21:47:02Z" level=info msg="Opening: 10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Adding backend: tcp://10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Set replica tcp://10.42.2.27:9502 to mode RW"
time="2018-12-02T21:47:02Z" level=info msg="Start monitoring tcp://10.42.2.27:9502"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.1.23:9502 revision counter 0"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.0.19:9502 revision counter 0"
time="2018-12-02T21:47:02Z" level=info msg="Get backend tcp://10.42.2.27:9502 revision counter 0"
time="2018-12-02T21:47:03Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T21:47:03Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T21:47:04Z" level=info msg="New data socket connnection established"
time="2018-12-02T21:47:05Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:05Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:08Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:08Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:11Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:11Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:14Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:14Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:17Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.1.24], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T21:47:17Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8"
time="2018-12-02T21:47:20Z" level=fatal msg="Failed to start up frontend: failed to start frontend: Failed to execute: longhorn-engine-launcher [--url 0.0.0.0:9510 frontend-start --id a56fbf18-138a-4604-9aba-b6aacad0a9d9], output time=\"2018-12-02T21:47:20Z\" level=fatal msg=\"Error running frontend-start command: failed to start frontend: rpc error: code = Unknown desc = Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-c46b43d9f67b11e8 -p 10.42.1.24 --login], output iscsiadm: No records found\\n, error exit status 21.\"\n, error exit status 1"

EDIT: Errors from flexvolume try

time="2018-12-02T22:13:37Z" level=info     msg="launcher: controller 25951b2f-4467-4ad2-9273-49322ee5a680 started"
time="2018-12-02T22:13:37Z" level=info msg="Starting with replicas [\"tcp://10.42.1.31:9502\" \"tcp://10.42.0.25:9502\" \"tcp://10.42.2.34:9502\"]"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.1.31:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.1.31:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.0.25:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.0.25:9502"
time="2018-12-02T22:13:37Z" level=info msg="Connecting to remote: 10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Opening: 10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Adding backend: tcp://10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Set replica tcp://10.42.2.34:9502 to mode RW"
time="2018-12-02T22:13:37Z" level=info msg="Start monitoring tcp://10.42.2.34:9502"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.1.31:9502 revision counter 0"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.0.25:9502 revision counter 0"
time="2018-12-02T22:13:37Z" level=info msg="Get backend tcp://10.42.2.34:9502 revision counter 0"
time="2018-12-02T22:13:38Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T22:13:38Z" level=warning msg="Failed to find dockerd in the process ancestors, fall back to use pid 1: Failed to find the ancestor process: dockerd"
time="2018-12-02T22:13:39Z" level=info msg="New data socket connnection established"
time="2018-12-02T22:13:40Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:40Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:43Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:43Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:46Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:46Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:49Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:49Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:52Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.2.35], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"
time="2018-12-02T22:13:52Z" level=warning msg="Nodes cleaned up for iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f"
time="2018-12-02T22:13:55Z" level=fatal msg="Failed to start up frontend: failed to start frontend: Failed to execute: longhorn-engine-launcher [--url 0.0.0.0:9510 frontend-start --id 25951b2f-4467-4ad2-9273-49322ee5a680], output time=\"2018-12-02T22:13:55Z\" level=fatal msg=\"Error running frontend-start command: failed to start frontend: rpc error: code = Unknown desc = Failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-7ec6992e-f67f-11e8-ae3b-00012e58aa1f -p 10.42.2.35 --login], output iscsiadm: No records found\\n, error exit status 21.\"\n, error exit status 1"

#2

Looks like the iscsiadm on your node wasn’t setup correctly. It’s not usable by Longhorn.

The error message reported by iscsiadm is:

iscsiadm: Cannot perform discovery. Invalid Initiatorname.
iscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7

You should able to reproduce it on your node by trying to use iscsiadm to connect to any iscsi target.

The error Invalid Initiatorname lead me to https://www.linuxquestions.org/questions/linux-server-73/iscsi-intiator-not-connecting-4175496417/ , maybe you can give it a try.


#3

Thanks!

I solved it by running on every node. Because initiatorname.iscsi was missing.

/sbin/iscsi-iname -p "InitiatorName=iqn.2005-03.org.open-iscsi" > /etc/iscsi/initiatorname.iscsi

Now I am off to figure out :slight_smile:

MountVolume.MountDevice failed for volume "pvc-3ba96732f74711e8" : rpc error: code = Unavailable desc = grpc: the connection is unavailable

#4

I think this error is related to the CSI directory. Probably the path was changed in Container-Linux, e.g. the location of kubelet root. Probably this can help https://kubernetes-csi.github.io/docs/CSI-Kubernetes.html


#5

Flexvolume works.

You are somewhat right csi.sock is on host in /var/lib/kubelet/plugins/io.rancher.longhorn/ (not mounted to kubelet) and not in /var/lib/kubelet/volumeplugins/ (mounted to kubelet).
IMHO it is not due to Container-Linux just RKE does not mount this directory by default to kubelet. I will try to figure out where to add another kubelet volume mount in RKE.


#6

Solved image

Thanks for guidance and help!

RKE’s cluster.yml needs kubectl mount for CNI to work

services:
  kubelet:
    extra_binds:
      - /var/lib/kubelet/plugins:/var/lib/kubelet/plugins:shared,z