For I/O to reflinked blocks we always need to write an entire new file system block, and the code enforces the file system block alignment for the entire file if it has any reflinked blocks. Unfortunately the reported dio alignment can only report a single value for reads and writes, so unless we want to trigger these read-modify write cycles all the time, we need to increase both limits. Without this zoned xfs triggers the warnings about failed page cache invalidation in kiocb_invalidate_post_direct_write all the time when running generic/551 when running on a 512 byte sector device, and eventually fails the test due to miscompares. Hopefully we can add a separate read alignment to statx eventually. Signed-off-by: Christoph Hellwig <hch@xxxxxx> --- fs/xfs/xfs_ioctl.c | 6 +++++- fs/xfs/xfs_iops.c | 15 ++++++++++++++- 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 726282e74d54..de8ba5345e17 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1213,7 +1213,11 @@ xfs_file_ioctl( struct xfs_buftarg *target = xfs_inode_buftarg(ip); struct dioattr da; - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize; + da.d_mem = target->bt_logical_sectorsize; + if (xfs_is_cow_inode(ip)) + da.d_miniosz = mp->m_sb.sb_blocksize; + else + da.d_miniosz = target->bt_logical_sectorsize; da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); if (copy_to_user(arg, &da, sizeof(da))) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 6b0228a21617..990df072ba35 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -582,7 +582,20 @@ xfs_report_dioalign( stat->result_mask |= STATX_DIOALIGN; stat->dio_mem_align = bdev_dma_alignment(bdev) + 1; - stat->dio_offset_align = bdev_logical_block_size(bdev); + + /* + * On COW inodes we are forced to always rewrite an entire file system + * block. + * + * Because applications assume they can do sector sized direct writes + * on XFS we provide an emulation by doing a read-modify-write cycle + * through the cache, but that is highly inefficient. Thus report the + * natively supported size here. + */ + if (xfs_is_cow_inode(ip)) + stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize; + else + stat->dio_offset_align = bdev_logical_block_size(bdev); } static void -- 2.45.2