Re: [PATCH 2/4] xfs: make file data allocations observe the 'forcealign' flag

Hi Darrick,

As mentioned internally, we have an issue for atomic writes [0]: we get an aligned but not fully-written extent when the initial write is smaller than the forcealign size, like:

# /mkfs.xfs -f -d forcealign=16k /dev/sda
...
# mount /dev/sda mnt
# touch  mnt/file
# /test-pwritev2 -a -d -l 4096 -p 0 /root/mnt/file # direct IO, atomic write, 4096B at pos 0
# filefrag -v mnt/file
Filesystem type is: 58465342
File size of mnt/file is 4096 (1 block of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       0:         24..        24:      1:             last,eof
mnt/file: 1 extent found
# /test-pwritev2 -a -d -l 16384 -p 0 /root/mnt/file
wrote -1 bytes at pos 0 write_size=16384
#

This breaks atomic writes: the 16K write now spans 2x mappings and so is split into 2x BIOs, which we cannot tolerate.
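
To make the "2x mappings" point concrete, here is a rough userspace toy model (not kernel code). The struct toy_extent type and count_mappings() helper are hypothetical names made up for this illustration; it only counts extents and holes under a write range and ignores the written/unwritten state.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of one extent record: [start, start + len) in filesystem blocks. */
struct toy_extent {
	uint64_t start;
	uint64_t len;
};

/*
 * Count the mappings a write to blocks [first, first + count) would see,
 * given sorted, non-overlapping extents. Each extent touched counts as one
 * mapping, and each hole before, between or after them counts as another.
 */
static unsigned int count_mappings(const struct toy_extent *ext, size_t n,
				   uint64_t first, uint64_t count)
{
	uint64_t pos = first, end = first + count;
	unsigned int mappings = 0;
	size_t i;

	for (i = 0; i < n && pos < end; i++) {
		uint64_t e_end = ext[i].start + ext[i].len;

		if (e_end <= pos)
			continue;	/* extent lies entirely before the write */
		if (ext[i].start >= end)
			break;		/* extent lies entirely after the write */
		if (ext[i].start > pos)
			mappings++;	/* hole in front of this extent */
		mappings++;		/* the extent itself */
		pos = e_end;
	}
	if (pos < end)
		mappings++;		/* trailing hole up to the end of the write */
	return mappings;
}
```

With the single 1-block extent from the filefrag output above, a 4-block (16K) write at block 0 counts as two mappings in this model: the existing block plus the hole behind it. With a single 4-block extent it counts as one.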

So how about this change on top:

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 731260a5af6d..6609f1058ae3 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -685,6 +685,12 @@ xfs_can_free_eofblocks(
        end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
        if (XFS_IS_REALTIME_INODE(ip) && mp->m_sb.sb_rextsize > 1)
                end_fsb = xfs_rtb_roundup_rtx(mp, end_fsb);
+
+       /* Don't trim eof blocks */
+       if (xfs_inode_force_align(ip)) {
+               end_fsb = roundup_64(end_fsb, xfs_get_extsz_hint(ip));
+       }
+
        last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
        if (last_fsb <= end_fsb)
                return false;
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 0c7008322326..c906e3a424d1 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -291,6 +291,10 @@ xfs_iomap_write_direct(
                }
        }

+       if (xfs_inode_force_align(ip)) {
+               bmapi_flags = XFS_BMAPI_ZERO;
+       }
+
        error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, dblocks,
                        rblocks, force, &tp);
        if (error)


Which gives:

# /mkfs.xfs -d forcealign=16k /dev/sda
...
# /test-pwritev2 -a -d -l 4096 -p 0 /root/mnt/file
wrote 4096 bytes at pos 0 write_size=4096
# filefrag -v mnt/file
Filesystem type is: 58465342
File size of mnt/file is 4096 (1 block of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       3:         24..        27:      4:             last,eof
mnt/file: 1 extent found
#
# /test-pwritev2 -a -d -l 16384 -p 0 /root/mnt/file
wrote 16384 bytes at pos 0 write_size=16384
# filefrag -v mnt/file
Filesystem type is: 58465342
File size of mnt/file is 16384 (4 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       3:         24..        27:      4:             last,eof
mnt/file: 1 extent found
#
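
For reference, the rounding in the xfs_can_free_eofblocks() hunk can be sketched in userspace roughly as below. This is only an illustration of the arithmetic, assuming the extent size hint is expressed in filesystem blocks; round_up_u64(), bytes_to_fsb() and eof_trim_start_fsb() are hypothetical stand-ins, not the kernel helpers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's roundup_64(): round x up to a multiple. */
static uint64_t round_up_u64(uint64_t x, uint64_t multiple)
{
	uint64_t rem = x % multiple;

	return rem ? x + (multiple - rem) : x;
}

/* Bytes to filesystem blocks, rounding up, in the spirit of XFS_B_TO_FSB(). */
static uint64_t bytes_to_fsb(uint64_t bytes, uint64_t blocksize)
{
	return (bytes + blocksize - 1) / blocksize;
}

/*
 * First block considered for EOF trimming. With forcealign set, this is
 * pushed out to the next alignment boundary, so the tail of the aligned
 * extent past EOF is not treated as trimmable.
 */
static uint64_t eof_trim_start_fsb(uint64_t isize, uint64_t blocksize,
				   uint64_t align_fsb, int forcealign)
{
	uint64_t end_fsb = bytes_to_fsb(isize, blocksize);

	if (forcealign)
		end_fsb = round_up_u64(end_fsb, align_fsb);
	return end_fsb;
}
```

With the values from the example (4096-byte file, 4096-byte blocks, 4-block alignment), eof_trim_start_fsb() gives 1 without forcealign and 4 with it, which lines up with the 0..3 extent that filefrag reports.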

Or maybe make that change conditional on the FS_XFLAG_ATOMICWRITES flag. Previously we were pre-zeroing the complete file to get around this.

Thanks,
John

[0] https://lore.kernel.org/linux-scsi/20240111161522.GB16626@xxxxxx/T/#mbc6824fbe9ce62c9506aa4c3f281173747695d77 (just referencing for others)



