I deployed Longhorn 1.8.0 on Talos Linux 1.9.2 using Helm with mostly default options. I’m using the v2 engine, I have 3 dedicated storage nodes, and I followed the Talos-specific setup for the bind mounts, huge pages, and kernel modules. After deploying the Helm chart, I noticed instance-manager-* pods failing to start on the non-storage worker nodes, complaining that there weren’t enough huge pages available.
I then tried labeling my 3 storage nodes and updating the Helm chart with a nodeSelector in the longhornManager, longhornUI, longhornDriver, and global sections. However, I still see instance-manager-* pods on the other workers.
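For reference, my values.yaml changes looked roughly like this (the label key `node.longhorn.io/storage` is just what I chose for my cluster, and the value paths are my reading of the chart’s values.yaml, so they may not be exact):

```yaml
# Helm values excerpt -- label key is mine, paths are from my reading of the chart
global:
  nodeSelector:
    node.longhorn.io/storage: "true"
longhornManager:
  nodeSelector:
    node.longhorn.io/storage: "true"
longhornUI:
  nodeSelector:
    node.longhorn.io/storage: "true"
longhornDriver:
  nodeSelector:
    node.longhorn.io/storage: "true"
```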
Finally I tried setting “System Managed Components Node Selector” to the node label, but this prevented me from attaching PVCs on non-storage worker nodes, so I removed it.
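If it matters, I set it via the Helm values rather than the UI. I believe the corresponding key is `systemManagedComponentsNodeSelector` under `defaultSettings`, with a `key:value` string format, but I may have that slightly wrong:

```yaml
# Helm values excerpt -- my attempt at the "System Managed Components Node
# Selector" setting; key name and format are my best guess from the docs
defaultSettings:
  systemManagedComponentsNodeSelector: "node.longhorn.io/storage:true"
```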
For now, I’ve added huge pages to the remaining worker nodes and Longhorn is working with the v2 engine. I checked the documentation, but I don’t understand what the instance-manager pods actually do or whether they need to run on every worker node.
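The huge pages workaround was a Talos machine config patch along these lines (the page count is what I picked based on the v2 engine prerequisites, not necessarily what you’d need):

```yaml
# Talos machine config patch applied to the remaining workers --
# 2 MiB huge pages; 1024 pages = 2 GiB, sized per the Longhorn v2 prerequisites
machine:
  sysctls:
    vm.nr_hugepages: "1024"
```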
Another question: why do the instance-manager pods use about 1 CPU and a little RAM on the non-storage worker nodes, while they show 0/0 on the actual storage nodes? Can someone help me understand what these pods actually do and why I can’t just run them on the storage nodes only?