On Mon, Mar 10, 2025 at 12:10:44PM +0000, John Garry wrote:
> On 09/03/2025 22:03, Dave Chinner wrote:
> > On Mon, Mar 03, 2025 at 05:11:20PM +0000, John Garry wrote:
> > > diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> > > index 4b721d935994..e6baa81e20d8 100644
> > > --- a/fs/xfs/libxfs/xfs_bmap.h
> > > +++ b/fs/xfs/libxfs/xfs_bmap.h
> > > @@ -87,6 +87,9 @@ struct xfs_bmalloca {
> > > /* Do not update the rmap btree. Used for reconstructing bmbt from rmapbt. */
> > > #define XFS_BMAPI_NORMAP (1u << 10)
> > > +/* Try to align allocations to the extent size hint */
> > > +#define XFS_BMAPI_EXTSZALIGN (1u << 11)
> >
> > Don't we already do that?
> >
> > Or is this doing something subtle and non-obvious like overriding
> > stripe width alignment for large atomic writes?
> >
>
> stripe alignment only comes into play for eof allocation.
>
> args->alignment is used in xfs_alloc_compute_aligned() to actually align the
> start bno.
>
> If I don't have this, then we can get this ping-pong affect when overwriting
> atomically the same region:
>
> # dd if=/dev/zero of=mnt/file bs=1M count=10 conv=fsync
> # xfs_bmap -vp mnt/file
> mnt/file:
> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET          TOTAL FLAGS
>   0: [0..20479]:      192..20671        0 (192..20671)       20480 000000
> # /xfs_io -d -C "pwrite -b 64k -V 1 -A -D 0 64k" mnt/file
> wrote 65536/65536 bytes at offset 0
> 64 KiB, 1 ops; 0.0525 sec (1.190 MiB/sec and 19.0425 ops/sec)
> # xfs_bmap -vp mnt/file
> mnt/file:
> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET          TOTAL FLAGS
>   0: [0..127]:        20672..20799      0 (20672..20799)       128 000000
>   1: [128..20479]:    320..20671        0 (320..20671)       20352 000000
> # /xfs_io -d -C "pwrite -b 64k -V 1 -A -D 0 64k" mnt/file
> wrote 65536/65536 bytes at offset 0
> 64 KiB, 1 ops; 0.0524 sec (1.191 MiB/sec and 19.0581 ops/sec)
> # xfs_bmap -vp mnt/file
> mnt/file:
> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET          TOTAL FLAGS
>   0: [0..20479]:      192..20671        0 (192..20671)       20480 000000
> # /xfs_io -d -C "pwrite -b 64k -V 1 -A -D 0 64k" mnt/file
> wrote 65536/65536 bytes at offset 0
> 64 KiB, 1 ops; 0.0524 sec (1.191 MiB/sec and 19.0611 ops/sec)
> # xfs_bmap -vp mnt/file
> mnt/file:
> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET          TOTAL FLAGS
>   0: [0..127]:        20672..20799      0 (20672..20799)       128 000000
>   1: [128..20479]:    320..20671        0 (320..20671)       20352 000000
>
> We are never getting aligned extents wrt write length, and so have to fall
> back to the SW-based atomic write always. That is not what we want.

Please add a comment to explain this where the XFS_BMAPI_EXTSZALIGN
flag is set, because it's not at all obvious what it is doing or why
it is needed from the name of the variable or the implementation.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
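
The "never getting aligned extents wrt write length" observation in the quoted transcript can be checked by hand. The following is only a sketch of that arithmetic, not code from the patch: the helper names are hypothetical, it assumes the xfs_bmap -v block ranges are in 512-byte units, and it assumes that a start block that is a multiple of the write length is the property of interest (any partition or device offset is ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: xfs_bmap -v reports block ranges in 512-byte units. */
#define BMAP_UNIT_BYTES 512u

/* Hypothetical helper: convert a write length in bytes to xfs_bmap units. */
static uint64_t bytes_to_bmap_units(uint64_t bytes)
{
	return bytes / BMAP_UNIT_BYTES;
}

/*
 * Hypothetical helper: return true if an extent that starts at start_unit
 * (in xfs_bmap units) begins on a multiple of the write length.
 */
static bool extent_start_is_aligned(uint64_t start_unit, uint64_t write_bytes)
{
	uint64_t units = bytes_to_bmap_units(write_bytes);

	/* A write shorter than one unit makes the check meaningless. */
	assert(units != 0);
	return start_unit % units == 0;
}
```

A 64 KiB write is 128 units. The three extent start blocks in the transcript (192, 20672 and 320) each leave a remainder of 64 when divided by 128, so extent_start_is_aligned() is false for every one of them, which matches the statement that the write never lands on an extent aligned to its length.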