Re: Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing

On Tue, 2009-06-02 at 11:00 -0400, Chuck Lever wrote:
> On May 30, 2009, at 9:02 AM, Greg Banks wrote:
> > On Sat, May 30, 2009 at 10:26 PM, Trond Myklebust
> > <trond.myklebust@xxxxxxxxxx> wrote:
> >> On Sat, 2009-05-30 at 10:22 +1000, Greg Banks wrote:
> >>> On Sat, May 30, 2009 at 3:35 AM, Trond Myklebust
> >>> <trond.myklebust@xxxxxxxxxx> wrote:
> >>>> On Fri, 2009-05-29 at 13:25 -0400, Brian R Cowan wrote:
> >>>>>
> >>>
> >>
> >> Firstly, the server only uses O_SYNC if you turn off write gathering
> >> (a.k.a. the 'wdelay' option). The default behaviour for the Linux nfs
> >> server is to always try write gathering and hence no O_SYNC.
> >
> > Well, write gathering is a total crock that AFAICS only helps
> > single-file writes on NFSv2.  For today's workloads all it does is
> > provide a hotspot on the two global variables that track writes in an
> > attempt to gather them.  Back when I worked on a server product,
> > no_wdelay was one of the standard options for new exports.
> 
> Really?  Even for NFSv3/4 FILE_SYNC?  I can understand that it  
> wouldn't have any real effect on UNSTABLE.

The question is why a sensible client would ever want to send more than
one NFSv3 write with FILE_SYNC. If you need to send multiple writes in
parallel to the same file, then it makes much more sense to use
UNSTABLE and follow up with a single COMMIT.
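
To put rough numbers on that, here is a toy model (a sketch, not real
client or server code). The stable_how values are the ones defined by
NFSv3; struct toy_server, toy_write() and toy_commit() are hypothetical
names made up for illustration, and the model assumes the server does
exactly one flush to stable storage per FILE_SYNC/DATA_SYNC write and one
per COMMIT that finds unstable data pending.

```c
/* NFSv3 stable_how values as defined by the protocol. */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/* Hypothetical toy server: counts flushes to stable storage. */
struct toy_server {
    unsigned flushes;   /* flushes to stable storage so far */
    unsigned pending;   /* UNSTABLE writes not yet committed */
};

/* Assumption: a stable write must hit the disk before the reply,
 * while an UNSTABLE write is only cached until a later COMMIT. */
void toy_write(struct toy_server *s, enum stable_how how)
{
    if (how == UNSTABLE)
        s->pending++;
    else
        s->flushes++;
}

/* Assumption: one COMMIT flushes everything pending in a single pass. */
void toy_commit(struct toy_server *s)
{
    if (s->pending > 0) {
        s->flushes++;
        s->pending = 0;
    }
}
```

In this model, eight FILE_SYNC writes cost eight flushes, whereas eight
UNSTABLE writes followed by one COMMIT cost one.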

Write gathering relies on waiting an arbitrary length of time in order
to see if someone is going to send another write. The protocol offers no
guidance as to how long that wait should be, and so (at least on the
Linux server) we've coded in a hard wait of 10ms if and only if we see
that something else has the file open for writing.

One problem with the Linux implementation is that the "something else"
could be another nfs server thread that happens to be in nfsd_write(),
but it could equally be another open NFSv4 stateid, an NLM lock, or a
local process that has the file open for writing.

Another problem is that the nfs server keeps a record of the last file
that was accessed, and also waits if it sees you are writing again to
that same file. Of course, it has no idea whether this is truly a
parallel write, or whether you just happen to be writing again to the
same file using O_SYNC...
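
The decision described above boils down to roughly the following. This
is a simplified sketch under stated assumptions, not the actual nfsd
code: struct file_state, struct gather_state and gather_wait_ms() are
hypothetical names, "writers" stands in for whatever open-for-write count
the server consults, and only the 10ms figure and the wdelay switch come
from the discussion itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* The hard-coded wait mentioned in the text. */
#define GATHER_WAIT_MS 10

/* Hypothetical view of the file being written. */
struct file_state {
    uint64_t ino;      /* identifies the file */
    int      writers;  /* holders with write access, including this call */
};

/* Hypothetical record of the last file written through the server. */
struct gather_state {
    bool     have_last;
    uint64_t last_ino;
};

/* Returns how long (in ms) this write would sleep hoping to be gathered. */
int gather_wait_ms(struct gather_state *gs, const struct file_state *f,
                   bool wdelay)
{
    /* Could be another nfsd thread, a stateid, a lock, or a local process:
     * the heuristic cannot tell them apart. */
    bool other_writer = f->writers > 1;

    /* Writing again to the file we saw last time also triggers the wait. */
    bool same_as_last = gs->have_last && gs->last_ino == f->ino;

    /* Remember this file for the next call. */
    gs->have_last = true;
    gs->last_ino = f->ino;

    if (!wdelay)
        return 0;   /* no_wdelay: no gathering, so no wait */

    return (other_writer || same_as_last) ? GATHER_WAIT_MS : 0;
}
```

Note that a lone client doing back-to-back O_SYNC writes to one file
takes the same_as_last branch and pays the wait on every write after the
first, even though nothing is there to gather.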

  Trond
