On Tue, Jul 16, 2024 at 12:11 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote: > > > > 在 2024/7/16 19:44, fdmanana@xxxxxxxxxx 写道: > > From: Filipe Manana <fdmanana@xxxxxxxx> > > > > Test a scenario of a compressed send stream that triggered a bug in the > > extent map merging code introduced in the merge window for 6.11. > > > > The commit that introduced the bug is on its way to Linus' tree and its > > subject is: > > > > "btrfs: introduce new members for extent_map" > > > > The corresponding fix was submitted to the btrfs mailing list, with the > > subject: > > > > "btrfs: fix corrupt read due to bad offset of a compressed extent map" > > > > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx> > > --- > > tests/btrfs/312 | 115 ++++++++++++++++++++++++++++++++++++++++++++ > > tests/btrfs/312.out | 7 +++ > > 2 files changed, 122 insertions(+) > > create mode 100755 tests/btrfs/312 > > create mode 100644 tests/btrfs/312.out > > > > diff --git a/tests/btrfs/312 b/tests/btrfs/312 > > new file mode 100755 > > index 00000000..ebecadc6 > > --- /dev/null > > +++ b/tests/btrfs/312 > > @@ -0,0 +1,115 @@ > > +#! /bin/bash > > +# SPDX-License-Identifier: GPL-2.0 > > +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved. > > +# > > +# FS QA Test 312 > > +# > > +# Test a scenario of a compressed send stream that triggered a bug in the extent > > +# map merging code introduced in the merge window for 6.11. > > +# > > +. ./common/preamble > > +_begin_fstest auto quick send compress > > + > > +_cleanup() > > +{ > > + cd / > > + rm -fr $tmp.* > > + rm -fr $send_files_dir > > +} > > + > > +. ./common/filter > > + > > +_require_btrfs_send_version 2 > > +_require_test > > +_require_scratch > > + > > +_fixed_by_kernel_commit XXXXXXXXXXXX \ > > + "btrfs: fix corrupt read due to bad offset of a compressed extent map" > > + > > +send_files_dir=$TEST_DIR/btrfs-test-$seq > > + > > +rm -fr $send_files_dir > > +mkdir $send_files_dir > > +first_stream="$send_files_dir/1.send" > > +second_stream="$send_files_dir/2.send" > > + > > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "first mkfs failed" > > +_scratch_mount -o compress > > + > > +# Create a compressed extent for the range [108K, 144K[. Since it's a > > +# non-aligned start offset, the first 3K of the extent are filled with zeroes. > > +# The i_size is set to 141K. > > +$XFS_IO_PROG -f -c "pwrite -S 0xab 111K 30K" $SCRATCH_MNT/foo >> $seqres.full > > + > > +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap1 >> $seqres.full > > + > > +# Overwrite a part of the extent we created before. > > +# This will make the send stream include an encoded write (compressed) for the > > +# file range [120K, 128K[. > > +$XFS_IO_PROG -c "pwrite -S 0xcd 120K 8K" $SCRATCH_MNT/foo >> $seqres.full > > + > > +# Now do write after those extents and leaving a hole in between. > > +# This results in expanding the last block of the extent we first created, that > > +# is, in filling with zeroes the file range [141K, 144K[ (3072 bytes), which > > +# belongs to the block in the range [140K, 144K[. > > +# > > +# When the destination filesystem receives from the send stream a write for that > > +# range ([140K, 144K[) it does a btrfs_get_extent() call to find the extent map > > +# containing the offset 140K. There's no loaded extent map covering that range > > +# so it will lookg at the subvolume tree to find a file extent item covering the > > +# range and then finds the file extent item covering the range [108K, 144K[ which > > +# corresponds to the first extent written to the file, before snapshoting. > > +# > > +# Note that at this point in time the destination filesystem processed an encoded > > +# write for the range [120K, 128K[, which created a compressed extent map for > > +# that range and a corresponding ordered extent, which has not yet completed when > > +# it received the write command for the [140K, 144K[ range, so the corresponding > > +# file extent item is not yet in the subvolume tree - that only happens when the > > +# ordered extent completes, at btrfs_finish_one_ordered(). > > +# > > +# So having found a file extent item for the range [108K, 144K[ where 140K falls > > +# into, it tries to add a compressed extent map for that range to the inode's > > +# extent map tree with a call to btrfs_add_extent_mapping() done at > > +# btrfs_get_extent(). That finds there's a loaded overlapping extent map for the > > +# range [120K, 128K[ (the extent from the previous encoded write) and then calls > > +# into merge_extent_mapping(). > > +# > > +# The merging ended adjusting the extent map we attempted to insert, covering > > +# the range [108K, 144K[, to cover instead the range [128K, 144K[ (length 16K) > > +# instead, since there's an existing extent map for the range [120K, 128K[ and > > +# we are looking for a range starting at 140K (and ending at 144K). However it > > +# didn't adjust the extent map's offset from 0 to 20K, resulting in future reads > > +# reading the incorrect range from the underlying extent (108K to 124K, 16K of > > +# length, instead of the 128K to 144K range). > > +# > > +# Note that for the incorrect extent map, and therefore read corruption, to > > +# happen, we depend on specific timings - the ordered extent for the encoded > > +# write for the range [120K, 128K[ must not complete before the destination > > +# of the send stream receives the write command for the range [140K, 144K[. > > +# > > Considering it's timing related, do we need multiple runs to make sure > we can hit it? We run fstests continuously, so there's not much point in that. > > And I'm not sure if fstests is the best place to put all those detailed > info. Why not? I like it there, it explains in detail the problem and without any of that, anyone reading the test will be puzzled about those specific write ranges, ordering, etc. > The _fixed_by_kernel_commit line looks enough to me. Looking at the changelog of the kernel commit doesn't explain everything, because this here is a specific context. That's why I left all those comments explaining everything. > > Thanks, > Qu > > > +$XFS_IO_PROG -c "pwrite -S 0xef 160K 4K" $SCRATCH_MNT/foo >> $seqres.full > > + > > +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap2 >> $seqres.full > > + > > +echo "Checksums in the original filesystem:" > > +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)" > > +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)" > > + > > +$BTRFS_UTIL_PROG send --compressed-data -q -f $first_stream $SCRATCH_MNT/snap1 > > +$BTRFS_UTIL_PROG send --compressed-data -q -f $second_stream \ > > + -p $SCRATCH_MNT/snap1 $SCRATCH_MNT/snap2 > > + > > +_scratch_unmount > > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "second mkfs failed" > > +_scratch_mount > > + > > +$BTRFS_UTIL_PROG receive -q -f $first_stream $SCRATCH_MNT > > +$BTRFS_UTIL_PROG receive -q -f $second_stream $SCRATCH_MNT > > + > > +echo "Checksums in the new filesystem:" > > +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)" > > +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)" > > + > > +# success, all done > > +status=0 > > +exit > > diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out > > new file mode 100644 > > index 00000000..75f1f4cc > > --- /dev/null > > +++ b/tests/btrfs/312.out > > @@ -0,0 +1,7 @@ > > +QA output created by 312 > > +Checksums in the original filesystem: > > +e3ba4a9cbb38d921adc30d7e795d6626 SCRATCH_MNT/snap1/foo > > +4de09f7184f63aa64b481f3031138920 SCRATCH_MNT/snap2/foo > > +Checksums in the new filesystem: > > +e3ba4a9cbb38d921adc30d7e795d6626 SCRATCH_MNT/snap1/foo > > +4de09f7184f63aa64b481f3031138920 SCRATCH_MNT/snap2/foo