NFS4 server performance

malcolm · July 6, 2015, 1:47pm

Is there a ‘simple’ guide to getting the most from an NFS4 server?
Our servers generally have 60-90 users hanging off them and they
are fine, but occasionally things conspire so that the full 180 users
are attached to each one and they get very slow.

I have fiddled with ‘use_kernel_nfsd_number’, and increasing it from
the default 4 to 64 makes a lot of difference, but going over 128 makes it
slower. I am assuming all those processes exhaust something else.

These servers just sit there being NFS4 servers ( for openSUSE
clients ) and nothing else.

Does anyone have some guides?

Ta

M

Jens-U · July 7, 2015, 7:30pm

Hi M,

The typical bottlenecks apply here. Server-wise these are disk I/O, CPU and memory. Then there is the network (not only bandwidth: NFS has pretty small window sizes, so latency is a factor, too). Protocol overheads, additional services (e.g. DNS delays) and NFS4-specific side effects (e.g. Kerberos performance) will have an influence, too.

What you haven’t told us is the workload these users run across the NFS mounts. Are these the home directories of up to 180 users, with all the temp files and databases (KDE, Firefox,…) running via NFS?

The main factors in our environment were disk I/O throughput and latencies (since switching to bcache, that has no longer been a problem), and setting the “async” option server-side.

use_kernel_nfsd_number

It depends on the number of requests that come in in parallel. “4” is a very conservative number, more suitable for test setups. We’re running at 128. Take a look at /proc/net/rpc/nfsd (the “th” values): the first value is the number of threads currently running and the second value counts the number of times all threads were busy. Depending on your kernel NFS server version, though, you may not get reasonable values there (all zeros). In that case /proc/fs/nfsd/pool_stats is the file to look at (see knfsd-stats.txt for a description).
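
A minimal sketch of reading those two files, in case it helps. The function names and the sample text are made up for illustration; the pool_stats columns are taken from the “#” header line of the file itself rather than hard-coded, since the set of columns may differ between kernel versions.

```python
# Made-up sample contents, only here to show the expected shape of the files.
SAMPLE_TH = "rc 0 0 0\nth 64 0 0.000 0.000 0.000\n"
SAMPLE_POOL_STATS = (
    "# pool packets-arrived sockets-enqueued threads-woken threads-timedout\n"
    "0 1000 500 400 2\n"
)


def parse_th_line(text):
    """Return (thread_count, times_all_busy) from the 'th' line, or None."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "th":
            return int(fields[1]), int(fields[2])
    return None


def parse_pool_stats(text):
    """Return one dict per pool, keyed by the column names in the '#' header."""
    columns, pools = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            columns = line.lstrip("#").split()
        elif line.strip() and columns:
            values = [int(v) for v in line.split()]
            pools.append(dict(zip(columns, values)))
    return pools


print(parse_th_line(SAMPLE_TH))
print(parse_pool_stats(SAMPLE_POOL_STATS))
```

On the server itself, pass in the real contents instead of the samples, e.g. the text read from /proc/net/rpc/nfsd and /proc/fs/nfsd/pool_stats.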

Regards,
Jens

malcolm · July 8, 2015, 10:46am

Hi, Thanks for replying

Yes, they are home directories ( for openSUSE 13.2 clients ).
Looking at /proc/net/rpc/nfsd shows 0 in the second column.
I was wondering if nfsd is running out of buffer space. The machines have 12G of RAM, so it could have most of it.
It seems to happen more often when 180 users are all using LibreOffice. I am going to investigate LO’s file locking as a suspect.

Ta

M

Jens-U · July 8, 2015, 1:49pm

Hi M,

[QUOTE]It seems to happen more often when 180 users are all using LibreOffice - I am going to investigate LO’s file locking as a suspect[/QUOTE]

Also try to collect some long-term statistics on server memory, CPU, network and disk I/O. If you have no systems management tool set up for this, fetch MRTG or the like and monitor the appropriate SNMP variables. The resulting graphs may shed some light on the actual bottleneck.
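
If setting up MRTG and SNMP is too much for a first look, a small script run from cron can serve as a stop-gap. The sketch below is not MRTG, just a stand-in: it turns the contents of /proc/meminfo and /proc/loadavg into one CSV line, covering memory and load only. The function names, the chosen fields and the sample values are assumptions for illustration.

```python
# Made-up sample contents in the format of /proc/meminfo and /proc/loadavg.
SAMPLE_MEMINFO = (
    "MemTotal:       12288000 kB\n"
    "MemFree:          204800 kB\n"
    "Cached:          9000000 kB\n"
)
SAMPLE_LOADAVG = "3.50 2.25 1.00 2/310 4242\n"


def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            values[key.strip()] = int(rest.split()[0])
    return values


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def csv_row(timestamp, meminfo, loadavg):
    """Build one CSV line: timestamp, MemFree, Cached, three load averages."""
    fields = [timestamp, str(meminfo.get("MemFree", "")), str(meminfo.get("Cached", ""))]
    fields += ["%.2f" % value for value in loadavg]
    return ",".join(fields)


print(csv_row("2015-07-08 13:49:00",
              parse_meminfo(SAMPLE_MEMINFO),
              parse_loadavg(SAMPLE_LOADAVG)))
```

Appending one such line per minute to a file gives something that can be graphed later and lined up against the times users report slowness.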

Regards,
Jens

malcolm · July 8, 2015, 5:23pm

I looked at /proc/fs/nfsd and the ‘sockets-enqueued’ is significantly non-zero

as in - about half a million… hmmm

M

Jens-U · July 8, 2015, 6:49pm

Hi M,

[QUOTE=interele;28674]I looked at /proc/fs/nfsd and the ‘sockets-enqueued’ is significantly non-zero

as in - about half a million… hmmm

M[/QUOTE]

I guess the more important question is: since when? You wrote that you started with 4 threads… if there was no server reboot in between, then those large numbers could date from that time.

It’s more important to monitor that number (as in “increase per time unit”) and correlate that with reports of bad overall performance.
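
As a rough sketch of what “increase per time unit” could look like in practice: sample the counter at a fixed interval and log the difference divided by the elapsed time. The function names, the 60-second interval and the sample text are assumptions; the column is looked up by name in the “#” header of pool_stats.

```python
import time

# Made-up sample in the format of /proc/fs/nfsd/pool_stats, with two pools.
SAMPLE_POOL_STATS = (
    "# pool packets-arrived sockets-enqueued threads-woken threads-timedout\n"
    "0 1000 500 400 2\n"
    "1 800 250 300 1\n"
)


def sum_sockets_enqueued(text):
    """Add up the sockets-enqueued column over all pools."""
    index, total = None, 0
    for line in text.splitlines():
        if line.startswith("#"):
            index = line.lstrip("#").split().index("sockets-enqueued")
        elif line.strip() and index is not None:
            total += int(line.split()[index])
    return total


def counter_rate(previous, current, elapsed_seconds):
    """Increase per second between two samples of a growing counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = current - previous
    if delta < 0:
        # Counter went backwards, e.g. after a reboot: count from zero again.
        delta = current
    return delta / elapsed_seconds


def watch(path="/proc/fs/nfsd/pool_stats", interval_seconds=60):
    """Print a timestamped rate once per interval until interrupted."""
    with open(path) as f:
        previous = sum_sockets_enqueued(f.read())
    while True:
        time.sleep(interval_seconds)
        with open(path) as f:
            current = sum_sockets_enqueued(f.read())
        rate = counter_rate(previous, current, interval_seconds)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), "%.2f/s" % rate)
        previous = current
```

Calling watch() on the server and keeping the output makes it possible to check whether the rate jumps at the times when users complain, rather than staring at the half-million total.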

Regards,
Jens
