Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

On Thu, Feb 01, 2018 at 08:29:49AM +0000, Terry Barnaby wrote:
> 1. Have an OPEN-SETATTR-WRITE RPC call all in one and a SETATTR-CLOSE call
> all in one. This would reduce the latency of a small file to 1ms rather than
> 3ms, thus 66% faster. It would require the client to delay the OPEN/SETATTR
> until the first WRITE. Not sure how possible this is in the implementations.
> Maybe READs could be improved as well, but getting the OPEN through quickly
> may be better in this case?
> 
> 2. Could go further with an OPEN-SETATTR-WRITE-CLOSE RPC call. (0.5ms vs
> 3ms).

The protocol doesn't currently let us delay the OPEN like that,
unfortunately.

What we can do that might help: we can grant a write delegation in the
reply to the OPEN.  In theory that should allow the following operations
to be performed asynchronously, so the untar can immediately issue the
next OPEN without waiting.  (In practice I'm not sure what the current
client will do.)

I'm expecting to get to write delegations this year....

It probably wouldn't be hard to hack the server to return write
delegations even when that's not necessarily correct, just to get an
idea of what kind of speedup is available here.

> 3. On sync/async modes, personally I think it would be better for the client
> to request the mount in sync/async mode. The setting of sync on the server
> side would just enforce sync mode for all clients. If the server is in the
> default async mode, clients can mount using sync or async as to their
> requirements. This seems to match normal VFS semantics and usage patterns
> better.

The client-side and server-side options are both named "sync", but they
aren't really related.  The server-side "async" export option causes the
server to lie to clients, telling them that data has reached disk even
when it hasn't.  This affects all clients, whether they mounted with
"sync" or "async".  It violates the NFS specs, so it is not the default.

I don't understand your proposal.  It sounds like you believe that
mounting on the client side with the "sync" option will make your data
safe even if the "async" option is set on the server side?
Unfortunately that's not how it works.
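
To be concrete, a client-side mount like this (hypothetical server and
path):

	mount -t nfs -o sync server:/srv/nfs /mnt

makes the client issue each write synchronously, but it does nothing to
stop a server exporting with "async" from acknowledging data it hasn't
actually committed.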

> 4. The 0.5ms RPC latency seems a bit high (ICMP pings 0.12ms). Maybe this
> is worth investigating in the Linux kernel processing (how?)?

Yes, that'd be interesting to investigate.  With some kernel tracing I
think it should be possible to get high-resolution timings for the
processing of a single RPC call, which would make a good start.
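
For instance (assuming trace-cmd and a kernel with the sunrpc and nfs
tracepoints), something like:

	trace-cmd record -e sunrpc -e nfs stat /mnt/somefile
	trace-cmd report

should give timestamped events for each RPC the stat generates.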

It'd probably also be interesting to start with the simplest possible
RPC and then work our way up to see where the RTT increases the
most--e.g. does an RPC ping (an RPC with procedure 0 and empty argument
and reply) already have a round-trip time closer to 0.5ms or to 0.12ms?
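
One way to measure that, absent a ready-made tool, is to hand-roll the
NULL call.  This is just an untested sketch--"server" is a placeholder,
and it assumes the NFS server is listening on the standard TCP port
2049:

	#!/usr/bin/env python3
	# Time ONC RPC NULL calls (procedure 0) over TCP, for comparison
	# with ICMP ping round-trip times.
	import socket, struct, time

	HOST, PORT, COUNT = "server", 2049, 100

	def null_call(xid):
	    # RPC call header: xid, CALL(0), rpcvers=2, prog=100003 (NFS),
	    # vers=4, proc=0 (NULL), AUTH_NONE credential and verifier.
	    body = struct.pack(">10I", xid, 0, 2, 100003, 4, 0, 0, 0, 0, 0)
	    # TCP record marking: last-fragment bit ORed with body length.
	    return struct.pack(">I", 0x80000000 | len(body)) + body

	s = socket.create_connection((HOST, PORT))
	s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
	rtts = []
	for xid in range(1, COUNT + 1):
	    t0 = time.monotonic()
	    s.sendall(null_call(xid))
	    s.recv(4096)  # a NULL reply fits in one small record
	    rtts.append((time.monotonic() - t0) * 1e3)
	rtts.sort()
	print("min %.3f ms, median %.3f ms" % (rtts[0], rtts[len(rtts) // 2]))

(Something cruder like "time rpcinfo -t server nfs 4" in a loop should
give a similar number, since rpcinfo -t also just calls procedure 0.)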

> 5. The 20ms RPC latency I see in sync mode needs looking at on my system,
> although async mode is fine for my usage. Maybe this ends up as 2 x 10ms
> drive seeks on ext4 and is thus expected.

Yes, this is why dedicated file servers have hardware designed to lower
that latency.

As long as you're exporting with "async" and don't care about data
safety across crashes or power outages, I guess you could go all the
way and mount your ext4 export with "nobarrier"; I *think* that will
let the system acknowledge writes as soon as they reach the disk's
write cache.  I don't recommend that.
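
(Concretely, that would be something along the lines of:

	mount -o remount,nobarrier /export

which tells ext4 to skip the cache-flush barriers it normally issues at
journal commit.)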

Just for fun I dug around a little for cheap options to get safe
low-latency storage:

For Intel you can cross-reference this list:

	https://ark.intel.com/Search/FeatureFilter?productType=solidstatedrives&EPLDP=true

of SSDs with "enhanced power loss data protection" (EPLDP) against
shopping sites, and I find, e.g., this for US $121:

	https://www.newegg.com/Product/Product.aspx?Item=9SIABVR66R5680

See the "device=" option in the ext4 man pages--you can use that to give
your existing ext4 filesystem an external journal on that device.  I
think you want "data=journal" as well, then writes should normally be
acknowledged once they hit that SSD's write cache, which should be quite
quick.
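
Roughly, the setup would be (untested; /dev/sdb standing in for the
SSD and /dev/sda1 for the exported filesystem, which must be unmounted
for the tune2fs steps):

	mke2fs -O journal_dev /dev/sdb           # format the SSD as an external journal
	tune2fs -O ^has_journal /dev/sda1        # drop the existing internal journal
	tune2fs -J device=/dev/sdb /dev/sda1     # attach the external journal
	mount -o data=journal /dev/sda1 /export  # journal data as well as metadata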

I was also curious whether there were PCIe SSDs, but the cheapest Intel
SSD with EPLDP is the P4800X, at US $1600.

Intel Optane Memory is interesting as it starts at $70.  It doesn't
have EPLDP, but the latency of the underlying storage might be better
even without that?

I haven't figured out how to get a similar list for other brands.

Just searching for "SSD power loss protection" on newegg:

This also claims "power loss protection" at $53, but I can't find any
reviews:

	https://www.newegg.com/Product/Product.aspx?Item=9SIA1K642V2376&cm_re=ssd_power_loss_protection-_-9SIA1K642V2376-_-Product

Or this?:

	https://www.newegg.com/Product/Product.aspx?Item=N82E16820156153&cm_re=ssd_power_loss_protection-_-20-156-153-_-Product

This is another interesting discussion of the problem:

	https://blogs.technet.microsoft.com/filecab/2016/11/18/dont-do-it-consumer-ssd/

--b.