[PATCH 08/43] xfs: report the correct dio alignment for COW inodes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



For I/O to reflinked blocks we always need to write an entire new
file system block, and the code enforces the file system block alignment
for the entire file if it has any reflinked blocks.

Unfortunately the reported dio alignment can only report a single value
for reads and writes, so unless we want to trigger these read-modify
write cycles all the time, we need to increase both limits.

Without this zoned xfs triggers the warnings about failed page cache
invalidation in kiocb_invalidate_post_direct_write all the time when
running generic/551 when running on a 512 byte sector device, and
eventually fails the test due to miscompares.

Hopefully we can add a separate read alignment to statx eventually.

Signed-off-by: Christoph Hellwig <hch@xxxxxx>
---
 fs/xfs/xfs_ioctl.c |  6 +++++-
 fs/xfs/xfs_iops.c  | 15 ++++++++++++++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 726282e74d54..de8ba5345e17 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1213,7 +1213,11 @@ xfs_file_ioctl(
 		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
 		struct dioattr		da;
 
-		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
+		da.d_mem = target->bt_logical_sectorsize;
+		if (xfs_is_cow_inode(ip))
+			da.d_miniosz = mp->m_sb.sb_blocksize;
+		else
+			da.d_miniosz = target->bt_logical_sectorsize;
 		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
 
 		if (copy_to_user(arg, &da, sizeof(da)))
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 6b0228a21617..990df072ba35 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -582,7 +582,20 @@ xfs_report_dioalign(
 
 	stat->result_mask |= STATX_DIOALIGN;
 	stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
-	stat->dio_offset_align = bdev_logical_block_size(bdev);
+
+	/*
+	 * On COW inodes we are forced to always rewrite an entire file system
+	 * block.
+	 *
+	 * Because applications assume they can do sector sized direct writes
+	 * on XFS we provide an emulation by doing a read-modify-write cycle
+	 * through the cache, but that is highly inefficient.  Thus report the
+	 * natively supported size here.
+	 */
+	if (xfs_is_cow_inode(ip))
+		stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize;
+	else
+		stat->dio_offset_align = bdev_logical_block_size(bdev);
 }
 
 static void
-- 
2.45.2





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux