Hi Dan, On Tue, 2024-06-18 at 18:33 +0300, Dan Aloni wrote: > There are some applications that write to predefined non-overlapping > file offsets from multiple clients and therefore don't need to rely > on > file locking. However, NFS file system behavior of extending writes > to > to deal with write fragmentation, causes those clients to corrupt > each > other's data. > > To help these applications, this change adds the `noextend` parameter > to > the mount options, and handles this case in `nfs_can_extend_write`. > > Clients can additionally add the 'noac' option to ensure page cache > flush on read for modified files. I'm not overly enamoured of the name "noextend". To me that sounds like it might have something to do with preventing appends. Can we find something that is a bit more descriptive? That said, and given your last comment about reads. Wouldn't it be better to have the application use O_DIRECT for these workloads? Turning off attribute caching is both racy and an inefficient way to manage page cache consistency. It forces the client to bombard the server with GETATTR requests in order to check that the page cache is in synch, whereas your description of the workload appears to suggest that the correct assumption should be that it is not in synch. IOW: I'm asking if the better solution might not be to rather implement something akin to Solaris' "forcedirectio"? > > Signed-off-by: Dan Aloni <dan.aloni@xxxxxxxxxxxx> > --- > fs/nfs/fs_context.c | 8 ++++++++ > fs/nfs/super.c | 3 +++ > fs/nfs/write.c | 3 +++ > include/linux/nfs_fs_sb.h | 1 + > 4 files changed, 15 insertions(+) > > diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c > index 6c9f3f6645dd..509718bc5b24 100644 > --- a/fs/nfs/fs_context.c > +++ b/fs/nfs/fs_context.c > @@ -49,6 +49,7 @@ enum nfs_param { > Opt_bsize, > Opt_clientaddr, > Opt_cto, > + Opt_extend, > Opt_fg, > Opt_fscache, > Opt_fscache_flag, > @@ -149,6 +150,7 @@ static const struct fs_parameter_spec > nfs_fs_parameters[] = { > fsparam_u32 ("bsize", Opt_bsize), > fsparam_string("clientaddr", Opt_clientaddr), > fsparam_flag_no("cto", Opt_cto), > + fsparam_flag_no("extend", Opt_extend), > fsparam_flag ("fg", Opt_fg), > fsparam_flag_no("fsc", Opt_fscache_flag), > fsparam_string("fsc", Opt_fscache), > @@ -592,6 +594,12 @@ static int nfs_fs_context_parse_param(struct > fs_context *fc, > else > ctx->flags |= NFS_MOUNT_TRUNK_DISCOVERY; > break; > + case Opt_extend: > + if (result.negated) > + ctx->flags |= NFS_MOUNT_NO_EXTEND; > + else > + ctx->flags &= ~NFS_MOUNT_NO_EXTEND; > + break; > case Opt_ac: > if (result.negated) > ctx->flags |= NFS_MOUNT_NOAC; > diff --git a/fs/nfs/super.c b/fs/nfs/super.c > index cbbd4866b0b7..f27fd3858913 100644 > --- a/fs/nfs/super.c > +++ b/fs/nfs/super.c > @@ -549,6 +549,9 @@ static void nfs_show_mount_options(struct > seq_file *m, struct nfs_server *nfss, > else > seq_puts(m, ",local_lock=posix"); > > + if (nfss->flags & NFS_MOUNT_NO_EXTEND) > + seq_puts(m, ",noextend"); > + > if (nfss->flags & NFS_MOUNT_WRITE_EAGER) { > if (nfss->flags & NFS_MOUNT_WRITE_WAIT) > seq_puts(m, ",write=wait"); > diff --git a/fs/nfs/write.c b/fs/nfs/write.c > index 2329cbb0e446..ed76c317b349 100644 > --- a/fs/nfs/write.c > +++ b/fs/nfs/write.c > @@ -1315,7 +1315,10 @@ static int nfs_can_extend_write(struct file > *file, struct folio *folio, > struct file_lock_context *flctx = > locks_inode_context(inode); > struct file_lock *fl; > int ret; > + unsigned int mntflags = NFS_SERVER(inode)->flags; > > + if (mntflags & NFS_MOUNT_NO_EXTEND) > + return 0; > if (file->f_flags & O_DSYNC) > return 0; > if (!nfs_folio_write_uptodate(folio, pagelen)) > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h > index 92de074e63b9..f6d8a4f63e50 100644 > --- a/include/linux/nfs_fs_sb.h > +++ b/include/linux/nfs_fs_sb.h > @@ -157,6 +157,7 @@ struct nfs_server { > #define NFS_MOUNT_WRITE_WAIT 0x02000000 > #define NFS_MOUNT_TRUNK_DISCOVERY 0x04000000 > #define NFS_MOUNT_SHUTDOWN 0x08000000 > +#define NFS_MOUNT_NO_EXTEND 0x10000000 > > unsigned int fattr_valid; /* Valid attributes > */ > unsigned int caps; /* server > capabilities */ -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx