Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

On 30/01/18 15:09, J. Bruce Fields wrote:
On Tue, Jan 30, 2018 at 08:49:27AM +0000, Terry Barnaby wrote:
On 29/01/18 22:28, J. Bruce Fields wrote:
On Mon, Jan 29, 2018 at 08:37:50PM +0000, Terry Barnaby wrote:
Ok, that's a shame, unless NFSv4's write performance with small files/dirs
is reasonable, which it isn't on my systems.
Although async was "unsafe", this was not an issue in many standard
scenarios, such as an NFS-mounted home directory only being used by one
client.
The async option also does not appear to work when using NFSv3. I guess it
was removed from that protocol at some point as well?
This isn't related to the NFS protocol version.

I think everybody's confusing the server-side "async" export option with
the client-side mount "async" option.  They're not really related.

The unsafe thing that speeds up file creates is the server-side "async"
option.  Sounds like you tried to use the client-side mount option
instead, which wouldn't do anything.
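
For illustration (the export path and client hostname below are made up),
the server-side option is set per export in /etc/exports and picked up
after re-exporting:

	/data2/tmp   client.example.com(rw,async,no_subtree_check)
	$ exportfs -ra

The "async" that can appear in a client's mount options (or /etc/fstab) is
a different, mostly-default flag and does not change when the server
commits to disk.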

What is the expected sort of write performance when untarring, for example,
the Linux kernel sources? Is 2 MBytes/sec on average on a Gigabit link
typical (3 mins to untar 4.14.15), or should it be better?
It's not bandwidth that matters, it's latency.

The file create isn't allowed to return until the server has created the
file and the change has actually reached disk.

So an RPC has to reach the server, which has to wait for disk, and then
the client has to get the RPC reply.  Usually it's the disk latency that
dominates.

And also the final close after the new file is written can't return
until all the new file data has reached disk.
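
Very roughly, per small file the exchange looks something like this (a
simplified sketch, not an exact RPC trace):

	OPEN(create)   -> server commits the new inode to disk -> reply
	WRITE(data)    -> may be acknowledged before reaching disk
	CLOSE/COMMIT   -> server flushes the file's data to disk -> reply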

v4.14.15 has 61305 files:

	$ git ls-tree -r  v4.14.15|wc -l
	61305

So time to create each file was about 3 minutes/61305 =~ 3ms.

So assuming two roundtrips per file, your disk latency is probably about
1.5ms?
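
A crude way to sanity-check that number directly on the server (the path
and file name here are just an example) is to time small synchronous
writes:

	$ dd if=/dev/zero of=/data2/tmp/latency-test bs=4k count=1000 oflag=dsync
	$ rm -f /data2/tmp/latency-test

dd reports the elapsed time; dividing by the 1000 writes gives an
approximate per-commit latency, which is roughly what each NFS file create
pays.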

You can improve the storage latency somehow (e.g. with a battery-backed
write cache) or use more parallelism (has anyone ever tried to write a
parallel untar?).  Or you can cheat and set the async export option, and
then the server will no longer wait for disk before replying.  The
problem is that on server reboot/crash, the client's assumptions about
which operations succeeded may turn out to be wrong.
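
(As a rough, untested sketch of the parallel-untar idea, splitting the
member list across a few concurrent extracts of the same archive:)

	$ tar -tzf linux-4.14.15.tar.gz > /tmp/members
	$ split -n l/4 /tmp/members /tmp/members.
	$ for p in /tmp/members.*; do tar -xzf linux-4.14.15.tar.gz -C /data2/tmp -T "$p" & done; wait

Each job re-reads the whole archive, so this only helps when per-file
commit latency, not bandwidth, is the bottleneck.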

--b.
Many thanks for your reply.

Yes, I understand the above (latency and the normally synchronous nature of
NFS). I have async defined in the server's /etc/exports options. I later
also defined it on the client side, as the async option on the server did
not appear to be working and I wondered if, with ongoing changes, it had
been moved there (it would make some sense for the client to define it and
pass this option over to the server, as the client knows, in most cases,
whether the bad aspects of async would be an issue for its usage in the
situation in question).

It's a server with large disks, so SSD is not really an option. The use of
async is OK for my usage (mainly an NFS-mounted /home, with users' home
files only in use by one client at a time, etc.).
Note it's not concurrent access that will cause problems, it's server
crashes.  A UPS may reduce the risk a little.

However, I have just found that async is actually working! I just did not
believe it was, due to the poor write performance. Without async on the
server the performance is truly abysmal. The figures I get for untarring the
kernel sources (4.14.15, 895 MBytes untarred) using "rm -fr linux-4.14.15;
sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)" are:

Untar on server to its local disk:  13 seconds, effective data rate: 68
MBytes/s

Untar on server over NFSv4.2 with async on server:  3 minutes, effective
data rate: 4.9 MBytes/sec

Untar on server over NFSv4.2 without async on server:  2 hours 12 minutes,
effective data rate: 115 kBytes/s !!
2:12 is 7920 seconds, and you've got 61305 files to write, so that's
about 130ms/file.  That's more than I'd expect even if you're waiting
for a few seeks on each file create, so there may indeed be something
wrong.

By comparison, on my little home server (Fedora, ext4, a couple of WD Black
1TB drives), with sync, that untar takes 7:44, about 8ms/file.
Ok, that is far more reasonable, so something is up on my systems :)
What speed do you get with the server export set to async?

What's the disk configuration and what filesystem is this?
The tests above were to a single SATA Western Digital Red 3TB (WDC WD30EFRX-68EUZN0) using ext4. Most of my tests have been to software RAID1 SATA disks: Western Digital Red 2TB on one server and Western Digital RE4 2TB (WDC WD2003FYYS-02W0B1) on another quad-core Xeon server, all using ext4 and all having plenty of RAM.
All on stock Fedora 27 (both server and client), fully updated.


Is it really expected for NFS to be this bad these days with a reasonably
typical operation, and are there no other tuning parameters that can help?
It's expected that the performance of single-threaded file creates will
depend on latency, not bandwidth.

I believe high-performance servers use battery backed write caches with
storage behind them that can do lots of IOPS.

(One thing I've been curious about is whether you could get better
performance cheaply on this kind of workload with ext3/4 striped across a
few drives and an external journal on SSD.  But when I experimented with
that a few years ago I found synchronous write latency wasn't much
better.  I didn't investigate why not; maybe that's just the way SSDs
are.)
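
(For reference, a hedged sketch of that external-journal setup; the device
names are placeholders and the commands destroy existing data:)

	$ mke2fs -O journal_dev /dev/sdc1          # journal device on the SSD
	$ mkfs.ext4 -J device=/dev/sdc1 /dev/md0   # data fs on the striped array
	$ mount -o data=journal /dev/md0 /data2    # route file data through the journal too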

--b.

_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx



