On Wed, Jan 8, 2025 at 1:16 PM John Garry <john.g.garry@xxxxxxxxxx> wrote:
>
>
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index c488ae26b23d0..2542f15496488 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -777,9 +777,10 @@ xfs_file_buffered_write(
> >         ssize_t                 ret;
> >         bool                    cleared_space = false;
> >         unsigned int            iolock;
> > +       bool                    atomic_write = iocb->ki_flags & IOCB_ATOMIC;
> >
> >  write_retry:
> > -       iolock = XFS_IOLOCK_EXCL;
> > +       iolock = atomic_write ? XFS_IOLOCK_SHARED : XFS_IOLOCK_EXCL;
> >         ret = xfs_ilock_iocb(iocb, iolock);
> > --
> >
> > xfs_file_write_checks() afterwards already takes care of promoting
> > XFS_IOLOCK_SHARED to XFS_IOLOCK_EXCL for extending writes.
> >
> > It is possible that XFS_IOLOCK_EXCL could be immediately demoted
> > back to XFS_IOLOCK_SHARED for atomic_writes as done in
> > xfs_file_dio_write_aligned().
> >
> > TBH, I am not sure which blockdevs support 4K atomic writes that could
> > be used to test this.
> >
> > John, can you share your test setup instructions for atomic writes?
>
> Please note that IOCB_ATOMIC is not supported for buffered IO, so we
> can't do this - we only support direct IO today.

Oops. I see now.

>
> And supporting buffered IO has its challenges; how to handle overlapping
> atomic writes of differing sizes sitting in the page cache is the main
> issue which comes to mind.
>

How about the combination of RWF_ATOMIC | RWF_UNCACHED [1]?
Would it be easier/possible to support this, considering that the write of
the folio is started before the write system call returns?

Note that an application that desires multithreaded atomicity of writes vs.
reads will only need to opt in to RWF_ATOMIC | RWF_UNCACHED writes, so this
is not expected to hurt its performance by killing the read caching.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/20241220154831.1086649-1-axboe@xxxxxxxxx/
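
To make the opt-in concrete, here is a rough userspace sketch of what such a write could look like from the application side. It is only an illustration under stated assumptions: RWF_UNCACHED is the flag proposed in [1] and its numeric value here is assumed to match that series, the helper names atomic_uncached_flags() and atomic_uncached_pwrite() are made up for this example, and nothing here implies that any kernel accepts this flag combination for buffered IO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* RWF_ATOMIC is part of the mainline UAPI; older toolchain headers may
 * not define it yet, so fall back to the UAPI value. */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/* ASSUMPTION: RWF_UNCACHED is only the flag proposed in [1]. The value
 * below is assumed to match that series and is not a stable ABI. */
#ifndef RWF_UNCACHED
#define RWF_UNCACHED 0x00000080
#endif

/* Hypothetical helper: the flag combination an application would opt in to. */
static int atomic_uncached_flags(void)
{
	return RWF_ATOMIC | RWF_UNCACHED;
}

/*
 * Hypothetical helper: issue a single write of len bytes at offset off
 * with both flags set. Returns the number of bytes written, or -errno
 * if the kernel or filesystem rejects the request.
 */
static ssize_t atomic_uncached_pwrite(int fd, const void *buf, size_t len,
				      off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);
	ret = pwritev2(fd, &iov, 1, off, atomic_uncached_flags());
	return ret < 0 ? -errno : ret;
}
```

Since IOCB_ATOMIC is only supported for direct IO today, a caller should expect a negative return value from this helper on a buffered file descriptor and fall back to whatever non-atomic path it already has.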