On Wed, Feb 21, 2024 at 09:49:28AM +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > A data corruption problem was reported by CoreOS image builders > when using reflink based disk image copies and then converting > them to qcow2 images. The converted images failed the conversion > verification step, and it was isolated down to the fact that > qemu-img uses SEEK_HOLE/SEEK_DATA to find the data it is supposed to > copy. > > The reproducer allowed me to isolate the issue down to a region of > the file that had overlapping data and COW fork extents, and the > problem was that the COW fork extent was being reported in it's > entirity by xfs_seek_iomap_begin() and so skipping over the real > data fork extents in that range. > > This was somewhat hidden by the fact that 'xfs_bmap -vvp' reported > all the extents correctly, and reading the file completely (i.e. not > using seek to skip holes) would map the file correctly and all the > correct data extents are read. Hence the problem is isolated to just > the xfs_seek_iomap_begin() implementation. > > Instrumentation with trace_printk made the problem obvious: we are > passing the wrong length to xfs_trim_extent() in > xfs_seek_iomap_begin(). We are passing the end_fsb, not the > maximum length of the extent we want to trim the map too. Hence the > COW extent map never gets trimmed to the start of the next data fork > extent, and so the seek code treats the entire COW fork extent as > unwritten and skips entirely over the data fork extents in that > range. > > Link: https://github.com/coreos/coreos-assembler/issues/3728 > Fixes: 60271ab79d40 ("xfs: fix SEEK_DATA for speculative COW fork preallocation") > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > --- > fs/xfs/xfs_iomap.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 18c8f168b153..055cdec2e9ad 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -1323,7 +1323,7 @@ xfs_seek_iomap_begin( > if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) { > if (data_fsb < cow_fsb + cmap.br_blockcount) > end_fsb = min(end_fsb, data_fsb); > - xfs_trim_extent(&cmap, offset_fsb, end_fsb); > + xfs_trim_extent(&cmap, offset_fsb, end_fsb - offset_fsb); Doh. Is there a reproducer we can hammer into a fstests regression test? Sure would be nice if the type system actually caught things like this for us. Anyway thanks for fixing this, Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> --D > seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED); > error = xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, > IOMAP_F_SHARED, seq); > @@ -1348,7 +1348,7 @@ xfs_seek_iomap_begin( > imap.br_state = XFS_EXT_NORM; > done: > seq = xfs_iomap_inode_sequence(ip, 0); > - xfs_trim_extent(&imap, offset_fsb, end_fsb); > + xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb); > error = xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq); > out_unlock: > xfs_iunlock(ip, lockmode); > -- > 2.43.0 > >