On 9/19/19 7:11 PM, Trond Myklebust wrote:
> No. It is not a problem, because nfs-utils defaults to using TCP
> mounts. Fragmentation is only a problem with UDP, and we stopped
> defaulting to that almost 2 decades ago.
>
> However it may well be that klibc is still defaulting to using UDP, in
> which case it should be fixed. There are major Linux distros out there
> today that don't even compile in support for NFS over UDP any more.
I haven't tested with UDP at all; the problem I'm seeing is with TCP.
I saw it with klibc nfsmount using TCP + NFSv3,
and with `mount -t nfs -o timeo=7 server:/share /mnt` using TCP + NFSv4.2.
Steps to reproduce:
1) Connect server <=> client at 10 or 100 Mbps.
Gigabit is also affected ("less snappy"), but it's less obvious there.
For reliable results, I made sure that server/client/network didn't have
any other load at all.
2) Server:
echo '/srv *(ro,async,no_subtree_check)' >> /etc/exports
exportfs -ra
truncate -s 10G /srv/10G.file
The sparse file ensures that disk IO bandwidth isn't an issue.
3) Client:
mount -t nfs -o timeo=7 192.168.1.112:/srv /mnt
dd if=/mnt/10G.file of=/dev/null status=progress
4) Result:
dd starts at 11.2 MB/s, which is fine/expected,
but over time it slowly drops to about 2 MB/s.
Its progress output lags, skipping seconds, e.g.:
507510784 bytes (508 MB, 484 MiB) copied, 186 s, 2,7 MB/s^C
At that point Ctrl+C needs 30+ seconds to stop dd,
because it is stuck waiting on I/O.
In another terminal tab, `dmesg -w` is full of these:
[ 316.404250] nfs: server 192.168.1.112 not responding, still trying
[ 316.759512] nfs: server 192.168.1.112 OK
5) Remarks:
With timeo=600, there are no errors in dmesg.
The fact that timeo=7 (the nfsmount default) causes errors shows that
some NFS replies take more than 0.7 seconds to arrive.
That in turn explains why applications open extremely slowly and feel
sluggish with an NFS-over-TCP netroot at 100 Mbps.
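A back-of-envelope check (my numbers, assuming timeo is in tenths of a
second as per nfs(5), and that rsize=1M means a single READ reply can
carry up to 1 MiB of data):

    1 MiB = 8388608 bits
    8388608 bits / 10 Mbit/s  ≈ 0.84 s  -> already longer than timeo=7 (0.7 s)
    8388608 bits / 100 Mbit/s ≈ 0.08 s  -> fits on its own, but readahead keeps
                                           several READs in flight, so a reply
                                           queued behind a few others can still
                                           exceed 0.7 s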
Lowering rsize,wsize from 1M to 32K solves all of those issues, with no
negative side effects that I can see (see the mount line below). Even on
gigabit, 32K makes applications a lot more snappy, so it's better there
too. On 10 Mbps, rsize=1M is completely unusable.
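Concretely, this is the kind of mount line I mean (rsize/wsize are in
bytes, so 32K is 32768; keeping the timeo=7 and the export from the test
above):

    mount -t nfs -o timeo=7,rsize=32768,wsize=32768 192.168.1.112:/srv /mnt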
So I'm not sure in which setups rsize=1M is the better default. Is it
only for 10G+ connections?
Thank you very much,
Alkis Georgopoulos