Problem adding AKS cluster to Rancher

We are trying to add an AKS cluster to Rancher and are running into a problem. Here is the Slack post with a full video recording that reproduces it in real time: Slack

As you can see, the Rancher pod crashes within a few minutes of us trying to add the AKS cluster through the Rancher UI. If Rancher is deployed in HA mode, all pods eventually die and end up in a crash loop.

We can provide any info on request to help resolve this. Any pointers on where to look would be greatly appreciated.

If we are doing anything wrong or stupid here (we have never used an AKS cluster with Rancher running in AKS before), please let us know.

We initially missed an error in the aks-config-operator logs. It looks like this:

Doing /etc/rancher/ssl
W0712 18:30:13.902837      10 loader.go:221] Config not found: /home/aks-operator/.kube/config
time="2022-07-12T18:30:14Z" level=info msg="Starting aks.cattle.io/v1, Kind=AKSClusterConfig controller"
time="2022-07-12T18:30:14Z" level=info msg="Starting /v1, Kind=Secret controller"
time="2022-07-12T18:30:14Z" level=info msg="Checking configuration for cluster [aksnonprod]"
E0712 18:30:15.575463      10 runtime.go:78] Observed a panic: runtime.boundsError{x:1, y:0, signed:true, code:0x0} (runtime error: index out of range [1] with length 0)
goroutine 252 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x14c27a0, 0xc0008a6090)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/runtime/runtime.go:48 +0x86
panic(0x14c27a0, 0xc0008a6090)
	/usr/local/go/src/runtime/panic.go:965 +0x1b9
github.com/rancher/aks-operator/controller.BuildUpstreamClusterState(0x1760658, 0xc000875e40, 0x17609d8, 0xc0003ce0c0, 0x7f25ec64eff0, 0xc000345f80, 0xc000396118, 0xd18c2e2800, 0x3, 0x6fc23ac00)
	/go/src/github.com/rancher/aks-operator/controller/aks-cluster-config-handler.go:674 +0x131f
github.com/rancher/aks-operator/controller.(*Handler).checkAndUpdate(0xc0003271c0, 0xc000396000, 0x0, 0x0, 0x0)
	/go/src/github.com/rancher/aks-operator/controller/aks-cluster-config-handler.go:339 +0x55c
github.com/rancher/aks-operator/controller.(*Handler).OnAksConfigChanged(0xc0003271c0, 0xc000040200, 0x1a, 0xc000396000, 0xc0002b08a0, 0xc0ab8d4d8b16b31f, 0x11b0ace1)
	/go/src/github.com/rancher/aks-operator/controller/aks-cluster-config-handler.go:121 +0x9d
github.com/rancher/aks-operator/controller.(*Handler).recordError.func1(0xc000040200, 0x1a, 0xc000396000, 0x14eb180, 0xeca0c0, 0x12cb0c0)
	/go/src/github.com/rancher/aks-operator/controller/aks-cluster-config-handler.go:171 +0x67
github.com/rancher/aks-operator/pkg/generated/controllers/aks.cattle.io/v1.FromAKSClusterConfigHandlerToHandler.func1(0xc000040200, 0x1a, 0x1739d68, 0xc000396000, 0xe0, 0xf, 0x14eb180, 0x40a84c)
	/go/src/github.com/rancher/aks-operator/pkg/generated/controllers/aks.cattle.io/v1/aksclusterconfig.go:105 +0x6b
github.com/rancher/lasso/pkg/controller.SharedControllerHandlerFunc.OnChange(0xc000329820, 0xc000040200, 0x1a, 0x1739d68, 0xc000396000, 0x1a, 0xc000026c01, 0x40a56c, 0x2030288)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/sharedcontroller.go:29 +0x4e
github.com/rancher/lasso/pkg/controller.(*SharedHandler).OnChange(0xc000327140, 0xc000040200, 0x1a, 0x1739d68, 0xc000396000, 0xc000915d01, 0x0)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/sharedhandler.go:69 +0x14c
github.com/rancher/lasso/pkg/controller.(*controller).syncHandler(0xc0000e09a0, 0xc000040200, 0x1a, 0xc000915e58, 0x4)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/controller.go:215 +0xd1
github.com/rancher/lasso/pkg/controller.(*controller).processSingleItem(0xc0000e09a0, 0x13ce1e0, 0xc00085c200, 0x0, 0x0)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/controller.go:197 +0xe7
github.com/rancher/lasso/pkg/controller.(*controller).processNextWorkItem(0xc0000e09a0, 0x203000)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/controller.go:174 +0x54
github.com/rancher/lasso/pkg/controller.(*controller).runWorker(...)
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/controller.go:163
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00085c2c0)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00085c2c0, 0x1734000, 0xc0002b0930, 0x1, 0xc00008b080)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00085c2c0, 0x3b9aca00, 0x0, 0xc00051fc01, 0xc00008b080)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00085c2c0, 0x3b9aca00, 0xc00008b080)
	/go/pkg/mod/k8s.io/apimachinery@v0.21.2/pkg/util/wait/wait.go:90 +0x4d
created by github.com/rancher/lasso/pkg/controller.(*controller).run
	/go/pkg/mod/github.com/rancher/lasso@v0.0.0-20210616224652-fc3ebd901c08/pkg/controller/controller.go:134 +0x33b
panic: runtime error: index out of range [1] with length 0 [recovered]
	panic: runtime error: index out of range [1] with length 0
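For anyone reading the trace: the message "index out of range [1] with length 0" means the code asked for element 1 of an empty (or nil) slice. We have not checked what aks-cluster-config-handler.go:674 actually does, so the following is only a sketch of one common way to get this exact message in Go: indexing a capture group on a regexp.FindStringSubmatch result when the input did not match. The pattern and the function names resourceGroupUnsafe and resourceGroupSafe are hypothetical and not taken from aks-operator.

```go
package main

import (
	"fmt"
	"regexp"
)

// resourceGroupRe is a hypothetical pattern for illustration only;
// it is NOT copied from the aks-operator source.
var resourceGroupRe = regexp.MustCompile(`/resourceGroups/([^/]+)/`)

// resourceGroupUnsafe indexes the capture group without checking for a match.
// FindStringSubmatch returns nil when nothing matches, so [1] panics.
func resourceGroupUnsafe(id string) string {
	return resourceGroupRe.FindStringSubmatch(id)[1]
}

// resourceGroupSafe checks the slice length first and returns an error
// instead of panicking.
func resourceGroupSafe(id string) (string, error) {
	m := resourceGroupRe.FindStringSubmatch(id)
	if len(m) < 2 {
		return "", fmt.Errorf("no resource group found in %q", id)
	}
	return m[1], nil
}

// panicMessage runs f and returns the recovered panic text, or "" if f
// returned normally.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	// An empty ID does not match the pattern, so the unchecked index panics.
	fmt.Println(panicMessage(func() { resourceGroupUnsafe("") }))

	// The guarded version reports an error for the same input.
	_, err := resourceGroupSafe("")
	fmt.Println(err)

	// A well-formed (made-up) ID works with both versions.
	rg, _ := resourceGroupSafe("/subscriptions/x/resourceGroups/my-rg/providers/y")
	fmt.Println(rg)
}
```

If the operator does something along these lines, an AKS cluster with an empty or unexpectedly formatted field would be enough to trigger the crash, which might explain why only some clusters are affected.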

It looks like /home/aks-operator/.kube/config is not in place. Is this a bug (i.e. should the config be created automatically), or are we missing something?
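One thing worth noting when reading the log: lines written through klog start with a severity letter (I, W, E or F) followed by the month and day. The "Config not found" line starts with W (warning), while the panic line starts with E (error), so the missing kubeconfig may not be what kills the process. Below is a minimal sketch for pulling the severity out of such lines; klogSeverity is a helper name made up for this post, not part of klog.

```go
package main

import "fmt"

// klogSeverity returns the severity implied by a klog-style header such as
// "W0712 18:30:13.902837 ...", or "" if the line does not look like klog output.
// This is a hypothetical helper, not a klog API.
func klogSeverity(line string) string {
	if len(line) < 5 {
		return ""
	}
	// The severity letter is followed by four digits (mmdd).
	for _, c := range line[1:5] {
		if c < '0' || c > '9' {
			return ""
		}
	}
	switch line[0] {
	case 'I':
		return "info"
	case 'W':
		return "warning"
	case 'E':
		return "error"
	case 'F':
		return "fatal"
	}
	return ""
}

func main() {
	lines := []string{
		"W0712 18:30:13.902837      10 loader.go:221] Config not found: /home/aks-operator/.kube/config",
		"E0712 18:30:15.575463      10 runtime.go:78] Observed a panic: ...",
	}
	for _, l := range lines {
		fmt.Printf("%-8s %.40s\n", klogSeverity(l), l)
	}
}
```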

I created an issue: Rancher 2.2.6 aks-config-operator /home/aks-operator/.kube/config not found · Issue #38285 · rancher/rancher · GitHub. No response yet :(

At this point I'm wondering whether anybody has successfully added an Azure AKS cluster to Rancher recently. Which versions are working for you?

Other people are seeing the same crash stack trace:

Something seems to be broken. Any help or workaround from the Rancher team would be appreciated.