Issues with the VmwarevSphere driver when deploying K8s clusters onto VMware in an enterprise environment

Hi,

I’ve just started to explore using Rancher to deploy k8s clusters onto VMware.

I’ve hit a couple of issues with vSphere DRS clusters and SDRS storage clusters.

If the deployment targets a vSphere DRS cluster with a single host, the deployment works as expected.

If I target a vSphere cluster with DRS enabled and multiple hosts, the deployment fails and the node logs report:
Error with pre-create check: “path ‘Dev/*’ resolves to multiple hosts”
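
My guess (an assumption, I have not read the driver code) is that the pre-create check expands a pattern such as Dev/* against the hosts in the cluster and only passes when exactly one host matches. Here is a minimal Python sketch of that behaviour. It is only an illustration: the host names and the resolve_single_host helper are made up, and the real driver is not written in Python.

```python
from fnmatch import fnmatchcase


def resolve_single_host(pattern, host_paths):
    """Return the one host path matching pattern; raise if none or several match."""
    matches = [path for path in host_paths if fnmatchcase(path, pattern)]
    if not matches:
        raise LookupError(f"path '{pattern}' did not match any host")
    if len(matches) > 1:
        # Ambiguous match: mirrors the wording of the error I see in the node logs.
        raise LookupError(f"path '{pattern}' resolves to multiple hosts")
    return matches[0]


# Hypothetical inventories; the host names are invented for illustration.
single_host_cluster = ["Dev/esx01"]
multi_host_cluster = ["Dev/esx01", "Dev/esx02", "Dev/esx03"]

print(resolve_single_host("Dev/*", single_host_cluster))

try:
    resolve_single_host("Dev/*", multi_host_cluster)
except LookupError as err:
    print(f"Error with pre-create check: {err}")
```

If that is roughly what happens, it would explain why the single-host cluster works and the multi-host cluster does not.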

Likewise, our storage consists of a number of 5TB LUNs on an all-flash array, with the LUNs grouped together under SDRS storage clusters.

When I target the SDRS cluster name, the deployment fails.
When I move a LUN out of the storage cluster so that it becomes a top-level LUN (under VMware), I’m then able to target that specific LUN.
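
One possible explanation, and this is purely my assumption, is that the datastore lookup only accepts objects that are themselves datastores, while an SDRS cluster is a container (a StoragePod in vSphere terms) that holds datastores. The toy Python model below shows that idea. The inventory contents, the find_datastore helper and its error text are all hypothetical and are not taken from the driver.

```python
# Hypothetical inventory: names and layout are invented for illustration only.
inventory = {
    "LUN-TOP-01": {"type": "Datastore"},
    "SDRS-Cluster-A": {"type": "StoragePod", "members": ["LUN-A-01", "LUN-A-02"]},
}


def find_datastore(name, inventory):
    """Return name if it is a top-level datastore; otherwise raise LookupError."""
    entry = inventory.get(name)
    if entry is None or entry["type"] != "Datastore":
        # Toy error text, not the message produced by the real driver.
        raise LookupError(f"no top-level datastore named '{name}'")
    return name


print(find_datastore("LUN-TOP-01", inventory))

for target in ("SDRS-Cluster-A", "LUN-A-01"):
    try:
        find_datastore(target, inventory)
    except LookupError as err:
        print(err)
```

In this model both the storage cluster name and a LUN inside the cluster fail to resolve, and only a top-level LUN works, which matches what I am seeing.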

I think the issues are related to these old bugs, but I’m not sure whether the problems lie with Docker or with Rancher.

To get around the DRS issue, I have to split one of our clusters in two: I take out a single node and create a separate single-host VMware cluster for it. Once the k8s cluster has been deployed via Rancher, I can then migrate the k8s cluster VMs back into the main DRS cluster.

Similarly, to get around the SDRS issue, I take a single LUN out of the VMware storage cluster, move it to the top level, and target that specific LUN in the deployment template. Once the k8s cluster is deployed, I can then Storage vMotion the VMs back into the appropriate storage cluster.

These sorts of hacks work OK for development/test, but they are not really acceptable from an enterprise perspective.
Is there any way these issues can be addressed?

In our development shop, we use nested vSphere instances to simulate multi-node VMware environments, so it should technically be possible for Rancher to reproduce these issues relatively easily.

If someone could take a look at these issues it would be great!

Thanks a lot!