Hi,
I'm using the kernel NFS client and server, and I'm trying to read as many small
files per second as possible from a single NFS client, but I seem to be running
into a bottleneck. Maybe it's just a tunable I'm missing: the CPUs on both client
and server are mostly idle, the 100Gbit (RoCE) network links between client and
server are mostly idle, and the NVMe drives in the server are mostly idle as
well. (The server also has enough RAM to easily fit my test data set in the
ext4/xfs page cache, but a second read of the data set from the RAM cache
doesn't change the result much.)
This is my test case:
# Create 1.6M 10KB files through 128 mdtest processes in different directories...
$ mpirun -hosts localhost -np 128 /path/to/mdtest -F -d /mnt/nfs/mdtest -i 1 -I 100 -z 1 -b 128 -L -u -w 10240 -e 10240 -C

# Read all the files through 128 mdtest processes (the case that matters primarily for my test)...
$ mpirun -hosts localhost -np 128 /path/to/mdtest -F -d /mnt/nfs/mdtest -i 1 -I 100 -z 1 -b 128 -L -u -w 10240 -e 10240 -E
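In case it helps to narrow this down, I assume something like the following would show whether all requests land in a single sunrpc thread pool on the server (just a sketch; it assumes the standard procfs/sysfs paths with the nfsd module loaded, which may differ on other setups):

```shell
# Server side: inspect the sunrpc thread pool configuration and activity.
# (Assumption: nfsd is running and these are the standard paths.)

# How worker threads are grouped into pools: "global" (one pool for all
# CPUs), "percpu", "pernode", or "auto".
cat /sys/module/sunrpc/parameters/pool_mode

# Configured number of nfsd worker threads.
cat /proc/fs/nfsd/threads

# Per-pool counters (packets arrived, sockets enqueued, threads woken,
# threads timed out). If all traffic lands in one pool, only that pool's
# threads will ever show up as busy in "top".
cat /proc/fs/nfsd/pool_stats
```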
The result is about 20,000 file reads per second, i.e. only ~200MB/s of network throughput.
I noticed in "top" that only 4 nfsd processes are active, so I'm wondering why
the load is not spread across more of my 64 nfsd threads (/proc/fs/nfsd/threads
is set to 64). Even the few nfsd processes that are active each use less than
50% of a core, and "top" shows the CPUs as >90% idle on both client and server
during the read phase.
I've tried:
* CentOS 7.5 and 7.6 kernels (3.10.0-...) on client and server, and Ubuntu 18
with a 4.18 kernel on the server side
* TCP & RDMA
* Mounted as NFSv3/v4.1/v4.2
* Increased tcp_slot_table_entries to 1024
...but none of that changed the fact that only 4 nfsd processes are active on
the server, and accordingly I get the same result if /proc/fs/nfsd/threads is
set to only 4 instead of 64.
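On the client side, I assume something like this would show whether the RPC transport itself is the limit (again just a sketch; the mountstats field layout varies a bit across kernel versions, and /mnt/nfs is my mount point):

```shell
# Client side: check whether the RPC transport is saturating.
# (Assumption: the NFS mount is at /mnt/nfs.)

# Per-mount RPC counters; the "xprt:" line carries transport-level
# send/receive counts and backlog queue usage, and the "READ:" line the
# per-op totals and latencies.
grep -A 40 'mounted on /mnt/nfs' /proc/self/mountstats | grep -E 'xprt:|READ:'

# Confirm the slot table setting actually took effect (it has to be set
# before the mount is established to apply to that transport).
cat /proc/sys/sunrpc/tcp_slot_table_entries
```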
Any pointers on how I can overcome this limit would be greatly appreciated.
Thanks in advance
Sven