Re: [PATCH 0/2] fix nfsd stable write implementation

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Tue, 30 Oct 2012 10:07:25 -0400

On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields@xxxxxxxxxx>
> wrote:
> 
> > From: "J. Bruce Fields" <bfields@xxxxxxxxxx>
> > 
> > Peter pointed out to me that the nfs server is implementing stable
> > writes by setting the O_SYNC flag.  I can't see why we couldn't write
> > and then sync instead, but I don't know this stuff as well as I should;
> > does the following look reasonable to people?
> 
> Bruce changed the code to implement stable writes by calling
> vfs_fsync_range().  I can't see why we couldn't use O_SYNC instead.
> 
> It seems like you are making a change just for the sake of making a change.
> Is there some reason that you think a separate 'sync' is more efficient than
> using O_SYNC?

Oh, sorry, see the changelog on the second patch: the problem is that
the struct file can be shared across multiple writes in the NFSv4 case,
so a single stable write could make all subsequent writes synchronous.

(And some day people would like filehandle caching for v2/v3, in which
case we'll run into the same problem.)
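
To make the difference concrete, here is a rough user-space analogue of
"write, then sync" (a sketch only, not the patch itself).  The helper
name write_maybe_stable() is made up for illustration, and fsync()
stands in for the kernel's vfs_fsync_range(); unlike the kernel call it
flushes the whole file rather than just the written range.  The point
is that stability is decided per request and the flags on the shared
open file are never touched:

```c
#define _POSIX_C_SOURCE 200809L  /* for pwrite() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write len bytes at offset
 * off, and make the data durable only if this particular request asked
 * for a stable write.  The descriptor's flags are left alone, so later
 * unstable writes through the same fd stay asynchronous.
 */
static int write_maybe_stable(int fd, const void *buf, size_t len,
                              off_t off, bool stable)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress; give up rather than spin */
        p += n;
        off += n;
        len -= (size_t)n;
    }

    /* Per-request sync: whole file here, a byte range in the kernel. */
    if (stable && fsync(fd) < 0)
        return -1;
    return 0;
}
```

With something like this, a stable write followed by an unstable write
on the same descriptor pays the sync cost only once, whereas setting
O_SYNC on the shared open file would leave every later write
synchronous.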

> As a general principle, I think it is best to give the file system as much
> information as possible so that it can make whatever optimisation decisions
> that it wants to.
> 
> Setting O_SYNC gives the filesystem more information than not, because it
> allows it to change the behaviour of the 'write' request... though I don't
> know if any filesystem actually uses the information.

I'm not sure how to figure out if that's a real problem or not.

--b.