Re: Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing

On Sat, May 30, 2009 at 3:35 AM, Trond Myklebust
<trond.myklebust@xxxxxxxxxx> wrote:
> On Fri, 2009-05-29 at 13:25 -0400, Brian R Cowan wrote:
>>
>
> What are you smoking? There is _NO_DIFFERENCE_ between what the server
> is supposed to do when sent a single stable write, and what it is
> supposed to do when sent an unstable write plus a commit. BOTH cases are
> supposed to result in the server writing the data to stable storage
> before the stable write / commit is allowed to return a reply.

This probably makes no difference to the discussion, but for a Linux
server there is a subtle difference between what the server is
supposed to do and what it actually does.

For a stable WRITE rpc, the Linux server sets O_SYNC in the struct
file during the vfs_writev() call and expects the underlying
filesystem to obey that flag and flush the data to disk.  For a COMMIT
rpc, the Linux server uses the underlying filesystem's f_op->fsync
instead.  This results in some potential differences:

 * The underlying filesystem might be broken in one code path and not
the other (e.g. ignoring O_SYNC in f_op->{aio_,}write or silently
failing in f_op->fsync).  These kinds of bugs tend to be subtle
because in the absence of a crash they affect only the timing of IO
and so they might not be noticed.

 * The underlying filesystem might do more, or better, work in one
code path than in the other, e.g. optimising allocations.

 * The Linux NFS server ignores the byte range in the COMMIT rpc and
flushes the whole file (I suspect this is a historical accident rather
than deliberate policy).  If there is other dirty data on that file
server-side, that other data will be written too before the COMMIT
reply is sent.  This may have a performance impact, depending on the
workload.
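
The two server-side paths described above have a rough userspace
analogue, sketched here for illustration only. This is not the nfsd
code: the function names are hypothetical, and the sketch only assumes
a Unix-like system where O_SYNC and fsync() are available. The first
function relies on O_SYNC at write time (like a stable WRITE); the
second does a plain write followed by fsync() (like an unstable WRITE
plus COMMIT). Note that fsync() takes no byte range, so it flushes all
dirty data for the file.

```python
import os
import tempfile


def write_all(fd, data):
    """Write every byte of data to fd, looping over short writes."""
    total = 0
    while total < len(data):
        total += os.write(fd, data[total:])
    return total


def stable_write(path, data):
    """Analogue of a stable WRITE: O_SYNC is set on the open file, so
    the filesystem is expected to flush the data before write returns."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        return write_all(fd, data)
    finally:
        os.close(fd)


def unstable_write_then_commit(path, data):
    """Analogue of an unstable WRITE plus COMMIT: a plain write followed
    by fsync(), which flushes the whole file rather than a byte range."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        written = write_all(fd, data)
        os.fsync(fd)  # goes through the filesystem's fsync path
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stable_write(os.path.join(tmp, "stable"), b"payload")
        unstable_write_then_commit(os.path.join(tmp, "unstable"), b"payload")
```

Both functions should leave the same bytes on disk; any difference
lies in which filesystem code path did the flushing and when, which is
why a bug in only one path is hard to spot without a crash.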

> The extra RPC round trip (+ parsing overhead ++++) due to the commit
> call is the _only_ difference.

This is almost completely true.  If the server behaved ideally and
predictably, this would be completely true.

</pedant>

-- 
Greg.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
