Lh 0.7.0 + Rancher 2.2.8 + CoreOs hangs on attaching volumes

Michal · November 26, 2019, 10:09am

Hi All,

K8s v1.15.3

I tried to use the longhorn storage class to provision some storage for a POD.
Seem to be hanging on attaching the volume. Status flips between attaching detaching.

time="2019-11-26T10:03:14Z" level=warning msg="pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: time=\"2019-11-26T10:02:58Z\" level=info msg=\"Get backend tcp://10.42.3.134:10000 revision counter 0\""

11/26/2019 10:03:14 AM time="2019-11-26T10:03:14Z" level=warning msg="pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: time=\"2019-11-26T10:02:59Z\" level=info msg=\"New data socket connnection established\""

11/26/2019 10:03:14 AM time="2019-11-26T10:03:14Z" level=warning msg="pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: time=\"2019-11-26T10:03:14Z\" level=error msg=\"Shutting down frontend due to failed to callback launcher: failed to start frontend: failed to call gRPC FrontendStartCallback for engine pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: rpc error: code = Unknown desc = failed to callback for engine pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5 frontend start: Failed to execute: nsenter [--mount=/host/proc/744/ns/mnt --net=/host/proc/744/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da -p 10.42.4.200 --login], output iscsiadm: No records found\\n, error exit status 21\""

11/26/2019 10:03:14 AM time="2019-11-26T10:03:14Z" level=warning msg="pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: 2019/11/26 10:03:14 failed to start up launcher: failed to start frontend: failed to call gRPC FrontendStartCallback for engine pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: rpc error: code = Unknown desc = failed to callback for engine pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5 frontend start: Failed to execute: nsenter [--mount=/host/proc/744/ns/mnt --net=/host/proc/744/ns/net iscsiadm -m node -T iqn.2014-09.com.rancher:pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da -p 10.42.4.200 --login], output iscsiadm: No records found"

11/26/2019 10:03:14 AM time="2019-11-26T10:03:14Z" level=warning msg="pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da-e-c25c8fb5: , error exit status 21"

11/26/2019 10:03:14 AM time="2019-11-26T10:03:14Z" level=warning msg="Engine of volume pvc-05d5c4fb-cf3b-4192-a1a6-207d8b6943da dead unexpectedly, reattach the volume"

Might be related to:

shuo-wu · November 26, 2019, 7:04pm

Could you provide the log of csi-attacher/csi-provisioner/longhorn-csi-plugin to see if there are any similar logs to that of the issue you mentioned?

Michal · November 27, 2019, 10:27am

No, csi logs seem ok
Workload: longhorn-csi-plugin

  I1125 16:40:16.222124 1 main.go:110] Version: v1.2.0-0-g6ef000ae

11/25/2019 4:40:16 PM I1125 16:40:16.222184 1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"

11/25/2019 4:40:16 PM I1125 16:40:16.222202 1 connection.go:151] Connecting to unix:///csi/csi.sock

11/25/2019 4:40:17 PM I1125 16:40:17.223102 1 main.go:127] Calling CSI driver to discover driver name

11/25/2019 4:40:17 PM I1125 16:40:17.223134 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo

11/25/2019 4:40:17 PM I1125 16:40:17.223146 1 connection.go:181] GRPC request: {}

11/25/2019 4:40:17 PM I1125 16:40:17.226989 1 connection.go:183] GRPC response: {"name":"driver.longhorn.io","vendor_version":"v0.7.0"}

11/25/2019 4:40:17 PM I1125 16:40:17.227434 1 connection.go:184] GRPC error: <nil>

11/25/2019 4:40:17 PM I1125 16:40:17.227442 1 main.go:137] CSI driver name: "driver.longhorn.io"

11/25/2019 4:40:17 PM I1125 16:40:17.227483 1 node_register.go:58] Starting Registration Server at: /registration/driver.longhorn.io-reg.sock

11/25/2019 4:40:17 PM I1125 16:40:17.227684 1 node_register.go:67] Registration Server started at: /registration/driver.longhorn.io-reg.sock

11/25/2019 4:40:17 PM I1125 16:40:17.377049 1 main.go:77] Received GetInfo call: &InfoRequest{}

11/25/2019 4:40:17 PM I1125 16:40:17.441531 1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

shuo-wu · November 27, 2019, 11:19pm

How about the logs of csi-attacher and csi-provionser?

If possible, you can directly click Generate Support Bundle link at the bottom of the UI and send the bundle to longhorn-support-bundle@rancher.com.

Michal · December 9, 2019, 10:46am

Done. Did it like 10 days ago or so.

shuo-wu · December 10, 2019, 7:04pm

I cannot find the support bundle you send… Could you check and resend it?

shuo-wu · December 16, 2019, 7:45pm

It seems that there is something wrong with your iscsiadm configuration on the host nodes. Then the Longhorn engine cannot be launched.

The log of instance-manager-e-267e7e8b:

2019-12-13T10:18:44.421776548Z [pvc-1207e229-7320-4363-9bbd-a77fa03b568a-e-d7aef2b8-8e0c9333] time="2019-12-13T10:18:44Z" level=info msg="New data socket connnection established"
2019-12-13T10:18:45.47934259Z [longhorn-instance-manager] time="2019-12-13T10:18:45Z" level=info msg="wait for gRPC service of process pvc-1207e229-7320-4363-9bbd-a77fa03b568a-e-d7aef2b8-8e0c9333 to start"
2019-12-13T10:18:45.616282235Z [longhorn-instance-manager] time="2019-12-13T10:18:45Z" level=warning msg="FAIL to discover due to Failed to execute: nsenter [--mount=/host/proc/1535/ns/mnt --net=/host/proc/1535/ns/net iscsiadm -m discovery -t sendtargets -p 10.42.3.77], output iscsiadm: Cannot perform discovery. Invalid Initiatorname.\niscsiadm: Could not perform SendTargets discovery: invalid parameter\n, error exit status 7"

You can refer to this post to solve this issue. If it doesn’t work, feel free to ask for help.

Topic		Replies	Views
Unable to provision volumes in fresh install of Longhorn (v0.7) Longhorn	1	6272	January 6, 2020
Unable to attach Longhorn storage to single-cluster Rancher k8s node Longhorn	6	13719	October 4, 2020
Longhorn PVC Bound but associated PV Terminating Longhorn	3	238	April 29, 2024
Existent persistent volume not found by the container Rancher	2	785	February 18, 2022
Longhorn - workload pod moved - storage did not Longhorn	2	1527	July 1, 2021

Lh 0.7.0 + Rancher 2.2.8 + CoreOs hangs on attaching volumes

Related topics