On Wed, 2024-06-19 at 11:37 +0300, Sagi Grimberg wrote:
>
>
> On 18/06/2024 21:59, Trond Myklebust wrote:
> > Hi Dan,
> >
> > On Tue, 2024-06-18 at 18:33 +0300, Dan Aloni wrote:
> > > There are some applications that write to predefined
> > > non-overlapping file offsets from multiple clients and therefore
> > > don't need to rely on file locking. However, NFS file system
> > > behavior of extending writes to deal with write fragmentation,
> > > causes those clients to corrupt each other's data.
> > >
> > > To help these applications, this change adds the `noextend`
> > > parameter to the mount options, and handles this case in
> > > `nfs_can_extend_write`.
> > >
> > > Clients can additionally add the 'noac' option to ensure page
> > > cache flush on read for modified files.
> > I'm not overly enamoured of the name "noextend". To me that sounds
> > like it might have something to do with preventing appends. Can we
> > find something that is a bit more descriptive?
>
> nopbw (No page boundary writes) ?
>
> >
> > That said, and given your last comment about reads. Wouldn't it be
> > better to have the application use O_DIRECT for these workloads?
> > Turning off attribute caching is both racy and an inefficient way to
> > manage page cache consistency. It forces the client to bombard the
> > server with GETATTR requests in order to check that the page cache
> > is in synch, whereas your description of the workload appears to
> > suggest that the correct assumption should be that it is not in
> > synch.
> >
> > IOW: I'm asking if the better solution might not be to rather
> > implement something akin to Solaris' "forcedirectio"?
>
> This access pattern represents a common case in HPC where different
> workers write records to a shared output file which do not necessarily
> align to a page boundary.
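
For illustration only (this is not code from the patch or from any
application mentioned in the thread): a minimal Python sketch of the access
pattern described above, in which each worker writes a fixed-size record at
its own non-overlapping offset in a shared file. The record size, file name,
and helper names are hypothetical. The sketch runs in a single process on a
local file, so it shows only the offset layout; it does not reproduce the
multi-client NFS behavior under discussion.

```python
import os
import tempfile

# Hypothetical record size, deliberately not a multiple of the page size,
# so that neighbouring records share a page.
RECORD_SIZE = 100


def write_record(fd, worker_id, payload):
    """Write one fixed-size record at the offset reserved for worker_id."""
    # Pad (or truncate) the payload so every record is exactly RECORD_SIZE bytes.
    record = payload.ljust(RECORD_SIZE, b"\0")[:RECORD_SIZE]
    offset = worker_id * RECORD_SIZE
    # pwrite() takes an explicit offset, so workers never share a file position.
    written = os.pwrite(fd, record, offset)
    return offset, written


def demo(num_workers=4):
    """Simulate the workers in one process and return the resulting file bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared_output.dat")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Write in reverse order to show that the layout does not depend
            # on the order in which workers happen to run.
            for worker_id in reversed(range(num_workers)):
                write_record(fd, worker_id, b"worker-%d" % worker_id)
            return os.pread(fd, num_workers * RECORD_SIZE, 0)
        finally:
            os.close(fd)
```

With 100-byte records, several workers' byte ranges fall within the same
page even though the ranges themselves never overlap, which is the situation
the thread is concerned with when the writers are on different NFS clients.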
>
> This is not everything that the app is doing nor the only file it is
> accessing, so IMO forcing directio universally may penalize the
> application.

Worse than forcing an attribute revalidation on every read?

BTW: We've been asked about the same issue from some of our customers,
and are planning on solving the problem by adding a new per-file
attribute to the NFSv4.2 protocol.

The detection of that NOCACHE attribute would cause the client to
automatically choose O_DIRECT on file open, overriding the default
buffered I/O model. So this would allow the user or sysadmin to specify
at file creation time that this file will be used for purposes that are
incompatible with caching.

If set on a directory, the same attribute would cause the client not to
cache the READDIR contents. This is useful when dealing with directories
where a Windows sysadmin may have set an Access Based Enumeration
property.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx