On Thu, 2022-07-28 at 10:59 -0400, Jeff Layton wrote: > On Thu, 2022-07-28 at 14:53 +0000, Trond Myklebust wrote: > > On Thu, 2022-07-28 at 10:24 -0400, Jeff Layton wrote: > > > Olga reported a case of data corruption in a situation where two > > > clients > > > (v3 and v4) were alternately doing open/write/close the same > > > file. > > > > > > Looking at captures, the v3 client failed to perform any of the > > > GETATTR > > > calls for CTO during one of the events, leading it to overwrite > > > some > > > data that had been written by the v4 client. > > > > > > We have two calls that work similarly: put_nfs_open_context and > > > put_nfs_open_context_sync. The only difference is the is_sync > > > parameter > > > that gets passed to close_context. The only caller of the _sync > > > variant > > > is in the close codepath. > > > > > > On a v2/3 mount, if the last reference is not put by > > > put_nfs_open_context_sync, then CTO revalidation never happens. > > > Fix > > > this > > > by adding a new flag to nfs_open_context and set that in > > > put_nfs_open_context_sync. In nfs_close_context, check for that > > > flag > > > when we're checking is_sync and treat them as equivalent. > > > > > > Cc: Scott Mayhew <smayhew@xxxxxxxxxx> > > > Cc: Ben Coddington <bcodding@xxxxxxxxxx> > > > Reported-by: Olga Kornieskaia <kolga@xxxxxxxxxx> > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> > > > --- > > > fs/nfs/inode.c | 3 ++- > > > include/linux/nfs_fs.h | 3 ++- > > > 2 files changed, 4 insertions(+), 2 deletions(-) > > > > > > I'm not sure this is the right fix. Maybe we'd be better off just > > > ignoring the is_sync parameter in nfs_close_context? It's > > > probably > > > functionally equivalent anyway. > > > > > > Thoughts? > > > > > > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c > > > index b4e46b0ffa2d..58b506a0a2f2 100644 > > > --- a/fs/nfs/inode.c > > > +++ b/fs/nfs/inode.c > > > @@ -1005,7 +1005,7 @@ void nfs_close_context(struct > > > nfs_open_context > > > *ctx, int is_sync) > > > > > > if (!(ctx->mode & FMODE_WRITE)) > > > return; > > > - if (!is_sync) > > > + if (!is_sync && !test_bit(NFS_CONTEXT_DO_CLOSE, &ctx- > > > >flags)) > > > return; > > > inode = d_inode(ctx->dentry); > > > if (NFS_PROTO(inode)->have_delegation(inode, FMODE_READ)) > > > > NACK. If the 'is_sync' flag is not set, then it is because the > > function > > is being called from a context where it is not safe to do a > > synchronous > > RPC call. > > > > Ok. Any thoughts on the right way to fix this then? It seems like > skipping revalidation because the last put wasn't in the close > codepath > is the wrong thing to do. Should we schedule it to a workqueue? If the attributes are marked as up to date, because they were already revalidated after the I/O was performed, then we're fine. If they are not up to date, then that is expected to trigger a cache invalidation on the next open when we compare the client's cached values to the server-retrieved values for the file attributes. i.e. this ought to be just a performance problem, and not a correctness problem, if attribute cache revalidation is working as designed. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx