On Sat, May 30, 2009 at 3:35 AM, Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:
> On Fri, 2009-05-29 at 13:25 -0400, Brian R Cowan wrote:
>>
>
> What are you smoking? There is _NO_DIFFERENCE_ between what the server
> is supposed to do when sent a single stable write, and what it is
> supposed to do when sent an unstable write plus a commit. BOTH cases are
> supposed to result in the server writing the data to stable storage
> before the stable write / commit is allowed to return a reply.

This probably makes no difference to the discussion, but for a Linux
server there is a subtle difference between what the server is supposed
to do and what it actually does.

For a stable WRITE rpc, the Linux server sets O_SYNC in the struct file
during the vfs_writev() call and expects the underlying filesystem to
obey that flag and flush the data to disk. For a COMMIT rpc, the Linux
server uses the underlying filesystem's f_op->fsync instead. This
results in some potential differences:

* The underlying filesystem might be broken in one code path and not
  the other (e.g. ignoring O_SYNC in f_op->{aio_,}write or silently
  failing in f_op->fsync). These kinds of bugs tend to be subtle
  because in the absence of a crash they affect only the timing of IO
  and so they might not be noticed.

* The underlying filesystem might be doing more or better things in
  one or the other code paths e.g. optimising allocations.

* The Linux NFS server ignores the byte range in the COMMIT rpc and
  flushes the whole file (I suspect this is a historical accident
  rather than deliberate policy). If there is other dirty data on that
  file server-side, that other data will be written too before the
  COMMIT reply is sent. This may have a performance impact, depending
  on the workload.

> The extra RPC round trip (+ parsing overhead ++++) due to the commit
> call is the _only_ difference.

This is almost completely true. If the server behaved ideally and
predictably, this would be completely true.

</pedant>

--
Greg.
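P.S. For anyone who wants to poke at the two code paths from userspace,
here is a rough analogue. It is only a sketch and not the nfsd code: the
function names stable_write() and unstable_write_then_commit() are made
up for illustration, and it assumes a Unix system where os.O_SYNC is
available. The first function relies on O_SYNC on the descriptor, as the
server does for a stable WRITE; the second writes without it and then
calls fsync(), as the server does for COMMIT. Note that fsync() takes no
byte range, so it flushes all dirty data on the file.

```python
import os
import tempfile


def stable_write(path, data):
    """Rough analogue of a stable WRITE: O_SYNC is set on the descriptor,
    so the filesystem is expected to flush the data before write() returns."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


def unstable_write_then_commit(path, data):
    """Rough analogue of an unstable WRITE followed by COMMIT: a plain
    write(), then fsync(), which flushes the whole file (no byte range)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, data)
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical demo: both paths should leave identical file contents;
    # any difference is in which filesystem code path does the flushing.
    with tempfile.TemporaryDirectory() as tmp:
        payload = b"some data\n"
        stable_write(os.path.join(tmp, "stable"), payload)
        unstable_write_then_commit(os.path.join(tmp, "committed"), payload)
```

In the absence of a crash the two functions are indistinguishable by
their results, which is why bugs in only one of the paths can go
unnoticed.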