I'm seeing the following strange behavior when trying to set up a LizardFS cluster in Rancher:
LizardFS uses a “master” server (which controls the cluster) and multiple “chunk” servers (which store the files). By default, the “chunk” servers connect to the “master” server using the name “mfsmaster”.
I create the service for the “master” container in Rancher, and then the service for the “chunk” containers.
What happens is that the “chunk” containers can’t resolve the name “mfsmaster”, even though I specify a service link to the “master” service under the name “mfsmaster”.
The strange thing is that if I edit the “chunk” service to delete the service link, and then edit it again to re-create the link, the “chunk” containers begin to resolve the name “mfsmaster”.
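For anyone who wants to reproduce this, here is a minimal sketch of how the lookup could be checked from inside a “chunk” container, before and after re-creating the link. It assumes Python is available in the container; the function name `wait_for_name` and its parameters are made up for illustration and are not part of LizardFS or Rancher.

```python
import socket
import time


def wait_for_name(name="mfsmaster", attempts=5, delay=2.0,
                  resolve=socket.gethostbyname, sleep=time.sleep):
    """Try to resolve `name` up to `attempts` times.

    Returns the resolved IPv4 address as a string, or None if every
    attempt fails. `resolve` and `sleep` are hypothetical hooks so the
    function can be exercised without a real DNS lookup.
    """
    for attempt in range(1, attempts + 1):
        try:
            return resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            # Pause between attempts, but not after the last one.
            if attempt < attempts:
                sleep(delay)
    return None
```

Calling `wait_for_name()` with the defaults returns the address that “mfsmaster” resolves to, or `None` if the name still can’t be resolved after all attempts. The retries only help tell a slow DNS update apart from a name that never appears.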
I had the same symptoms when I deployed a user-space NFS client/server service, i.e. the clients didn’t resolve the server’s name. Of course, I had added the service link on the clients.
Why can’t the service that has the service link resolve the name at first, and why does it start working after I remove the link and add it again?
Once that workaround is applied, the problem doesn’t occur again, even after stopping and starting the services, restarting Docker, or rebooting the physical servers.