Hi, we have a two-node SLES high-availability NFS cluster in Azure, using Corosync and DRBD to replicate the data to the hot-standby node.
Sometimes the system runs into problems and all resources are completely consumed: the load average skyrockets, and after a while we have to let the standby node take over the NFS shares.
Two questions:
- I noticed that some directories on our high-availability NFS share contain 100,000 files or more. Can this cause trouble for the system?
- After a spike in the load average, I see that rootvg is mostly busy with itself, I assume to get its bookkeeping in order for syncing the filesystems between the production node and the hot-standby node. Having such a huge number of files in one directory could make that even worse, I think. Is this assumption correct?
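
To make the first question more concrete, the directories in question could be located with something like the Python sketch below. This is only a rough sketch under assumptions: the path `/srv/nfs` and the function name `find_large_dirs` are hypothetical placeholders, not our actual export path or an existing tool, and the threshold of 100,000 is simply the number mentioned above.

```python
import os
import sys


def find_large_dirs(root, threshold=100_000):
    """Yield (path, entry_count) for every directory under root that
    holds at least `threshold` direct entries (files plus subdirectories)."""
    for dirpath, dirnames, filenames in os.walk(root):
        entry_count = len(dirnames) + len(filenames)
        if entry_count >= threshold:
            yield dirpath, entry_count


if __name__ == "__main__":
    # "/srv/nfs" is a hypothetical default; pass the real export path as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/srv/nfs"
    # Print the largest directories first.
    for path, count in sorted(find_large_dirs(root), key=lambda item: -item[1]):
        print(f"{count:>10}  {path}")
```

Walking the whole share generates metadata I/O of its own, so it is probably best run outside peak hours or against a smaller subtree first.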
Regards,
Gerard