Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

On Tue, Jan 30, 2018 at 04:31:58PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 30, 2018 at 07:03:17PM +0000, Terry Barnaby wrote:
> > It looks like each RPC call takes about 0.5ms. Why do there need to be so
> > many RPC calls for this? The OPEN call could set the attribs, no need for
> > the later GETATTR or SETATTR calls.
> 
> The first SETATTR (which sets ctime and mtime to server's time) seems
> unnecessary, maybe there's a client bug.
> 
> The second looks like tar's fault, strace shows it doing a utimensat()
> on each file.  I don't know why or if that's optional.
> 
> > Even the CLOSE could be integrated with the WRITE and taking this
> > further OPEN could do OPEN, SETATTR, and some WRITE all in one.
> 
> We'd probably need some new protocol to make it safe to return from the
> open system call before we've gotten the OPEN reply from the server.
> 
> Write delegations might save us from having to wait for the other
> operations.
> 
> Taking a look at my own setup, I see the same calls taking about 1ms.
> The drives can't do that, so I've got a problem somewhere too....

Whoops, I totally forgot it was still set up with an external journal on
SSD:

	# tune2fs -l /dev/mapper/export-export |grep '^Journal'
	Journal UUID:             dc356049-6e2f-4e74-b185-5357bee73a32
	Journal device:	          0x0803
	Journal backup:           inode blocks
	# blkid --uuid dc356049-6e2f-4e74-b185-5357bee73a32
	/dev/sda3
	# cat /sys/block/sda/device/model 
	INTEL SSDSA2M080

So, most of the data is striped across a couple big hard drives, but the
journal is actually on a small partition on an SSD.
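
As a side note on reading that output: the "Journal device" value is a
device number, not a path.  A minimal sketch of decoding it by hand,
assuming the value fits the legacy 16-bit encoding (major number in the
high byte, minor in the low byte); decode_dev is just a helper name made
up for this example:

```shell
#!/bin/sh
# Split a legacy 16-bit device number, as printed by tune2fs -l,
# into "major:minor".
# Assumption: old-style encoding, major in bits 8-15, minor in bits 0-7.
decode_dev() {
    dev=$(( $1 ))                              # arithmetic expansion accepts the 0x prefix
    echo "$(( dev >> 8 )):$(( dev & 0xff ))"
}

decode_dev 0x0803    # prints 8:3
```

Major 8 is the sd block driver and minor 3 on the first disk is its
third partition, which agrees with blkid reporting /dev/sda3.  Running
"lsblk -o NAME,MAJ:MIN" should show the same pair next to the partition.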

If I remember correctly, I initially tried this with an older Intel SSD
and didn't get a performance improvement.  Then I replaced it with this
model which has the "Enhanced Power Loss Data Protection" feature, which
I believe means the write cache is durable, so it should be able to
safely acknowledge writes as soon as they reach the SSD's cache.

And weirdly I think I never actually got around to rerunning these tests
after I installed the new SSD.

Anyway, so that might explain the difference we're seeing.

I'm not sure how to find new SSDs with that feature, but it may be worth
considering as a cheap way to accelerate this kind of workload.  It can
be a very small SSD as it only needs to hold the journal.  Adding an
external journal is a quick operation (you don't have to recreate the
filesystem or anything).
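
For anyone who wants to try it, here is a rough dry-run sketch of the
steps as I understand them.  It only prints the commands rather than
running them.  The device names default to the ones from my setup above
and are placeholders for yours, the 4096-byte block size is an
assumption (it has to match the filesystem's block size), and
plan_external_journal is just a name made up for this example:

```shell
#!/bin/sh
# Dry run: print the commands for moving an ext4 journal to another device.
# Nothing is executed. The filesystem must be unmounted (and clean) before
# the printed commands are actually run.
FS_DEV=${FS_DEV:-/dev/mapper/export-export}    # placeholder: filesystem device
JOURNAL_DEV=${JOURNAL_DEV:-/dev/sda3}          # placeholder: SSD partition for the journal
BLOCK_SIZE=${BLOCK_SIZE:-4096}                 # assumed; must match the filesystem block size

plan_external_journal() {
    # 1. Format the SSD partition as a dedicated journal device.
    echo "mke2fs -O journal_dev -b $BLOCK_SIZE $JOURNAL_DEV"
    # 2. Remove the existing internal journal from the filesystem.
    echo "tune2fs -O ^has_journal $FS_DEV"
    # 3. Attach the external journal to the filesystem.
    echo "tune2fs -J device=$JOURNAL_DEV $FS_DEV"
}

plan_external_journal
```

Check the block size with "tune2fs -l" first, and note that mke2fs will
wipe whatever is on the journal partition, so double-check the device
name before running anything for real.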

--b.