Node does not exist

CrankyCoder · July 6, 2021, 11:05pm

Created a new PVC today, the first one since upgrading to 1.1.1. Not sure if it’s related (I would lean toward NOT related, but figured I would mention it).

Warning  FailedAttachVolume  <invalid> (x11 over 6m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-22cf1f37-f0f2-4203-9918-904816a80841" : rpc error: code = NotFound desc = ControllerPublishVolume: the node kube-1 does not exist

I have tried deleting the pod and tried to get it to run somewhere else. Same thing for kube-2 and kube-5.

The Longhorn UI shows all the nodes, even the ones the error says do not exist.

Has anyone seen this?

c3y1huang · July 7, 2021, 2:00am

Hmm the error doesn’t look familiar. It could just be temporarily blocked. If you can provide us your support bundle then we can have a better idea of what is happening.

You can attach the support bundle here or send it to longhorn-support-bundle@Suse.com with the issue number - 20666.

CrankyCoder · July 7, 2021, 3:06pm

I have sent over the support bundle. I saw this happen with a couple of nodes, so I’m not sure if I have something SERIOUSLY wrong in the underlying cluster or what. It seems to only be happening with this new storage request. All my existing ones seem to be fine.

c3y1huang · July 8, 2021, 10:38am

Thank you @CrankyCoder, we’ve received it. Can you also create a bug report with the reproducing steps and environment info?

CrankyCoder · July 8, 2021, 12:44pm

I can, but I don’t know how it manifested to begin with. So I am unsure if it’s a bug or something wrong with my cluster or what.

c3y1huang · July 9, 2021, 1:26am

Is it reproducible on a new cluster? Are you able to add a new node and a volume to attach to the new node to see if it’s schedulable?

We also saw quite a lot of log entries related to the metrics server. Are you able to disable it and see if that makes a difference?

2021-07-07T15:02:08.240143213Z time="2021-07-07T15:02:08Z" level=warning msg="error during scrape" collector=node error="the server could not find the requested resource (get nodes.metrics.k8s.io)" node=kube-1

CrankyCoder · July 12, 2021, 6:16pm

I am building a new cluster starting today. Longhorn will be the first “workload” to go into it as soon as it’s online.

BBBeggar · February 3, 2023, 8:05am

Hello there!
I encountered the same issue, and it was solved by redeploying a bit later, so I cannot provide more information on the topic. I just wanted to check whether you were eventually able to solve this on your side as well?
Cheers

weizhe0422 · February 21, 2023, 5:31pm

What version of Longhorn are you using?

ControllerPublishVolume is the CSI call that sends the actual attach request to the Longhorn API.

Could you run kubectl get csinodes to see if the target node is in the list?
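If the cluster has many nodes, the comparison can be scripted. The following is only a rough sketch, not an official Longhorn tool: it assumes kubectl is on the PATH and pointed at the affected cluster, and it assumes the Longhorn CSI driver is registered under the name driver.longhorn.io. The helper names (nodes_missing_driver, kubectl_json) are made up for this example.

```python
import json
import subprocess

# Assumption: Longhorn registers its CSI driver under this name.
LONGHORN_DRIVER = "driver.longhorn.io"


def nodes_missing_driver(nodes, csinodes, driver=LONGHORN_DRIVER):
    """Return the names of nodes that have no CSINode entry listing `driver`.

    `nodes` and `csinodes` are the parsed JSON lists returned by
    `kubectl get nodes -o json` and `kubectl get csinodes -o json`.
    """
    registered = set()
    for item in csinodes.get("items", []):
        # spec.drivers may be absent or null when no driver has registered yet.
        drivers = item.get("spec", {}).get("drivers") or []
        if any(d.get("name") == driver for d in drivers):
            registered.add(item["metadata"]["name"])
    return sorted(
        item["metadata"]["name"]
        for item in nodes.get("items", [])
        if item["metadata"]["name"] not in registered
    )


def kubectl_json(*args):
    """Run a kubectl command with JSON output and return the parsed result."""
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    missing = nodes_missing_driver(
        kubectl_json("get", "nodes"),
        kubectl_json("get", "csinodes"),
    )
    print("Nodes without the Longhorn CSI driver registered:", missing or "none")
```

Any node the script prints would be worth a closer look, for example by checking the Longhorn CSI plugin pod on that node.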

allenf · December 13, 2023, 10:04pm

I’m having the same issue.

I created a new bug report here: [BUG] LongHorn is not detecting any nodes after 16 (Issue #7329, longhorn/longhorn on GitHub)

I cannot afford to recreate the cluster, as most of my pods in it are production. But I really need answers, as I cannot complete the setup.

PhanLe0110 · December 19, 2023, 3:34am

We are following up on your report in the GitHub ticket.
