On Wed, Dec 11, 2019 at 10:41 AM Qu Wenruo <wqu@xxxxxxxx> wrote: > > [BUG] > When using btrfs-progs v5.4, btrfs/142 and btrfs/143 will fail: > btrfs/142 1s ... - output mismatch (see xfstests/results//btrfs/142.out.bad) > --- tests/btrfs/142.out 2018-09-16 21:30:48.505104287 +0100 > +++ xfstests/results//btrfs/142.out.bad > 2019-12-10 15:35:40.280392626 +0000 > @@ -3,37 +3,37 @@ > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > wrote 65536/65536 bytes > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > ... > (Run 'diff -u xfstests/tests/btrfs/142.out xfstests/results//btrfs/142.out.bad' to see the entire diff) > > [CAUSE] > Btrfs/14[23] test whether a read on corrupted stripe will re-silver > itself. > Such test by its nature will need to modify on-disk data, thus need to > get the btrfs logical -> physical mapping, which is done by near > hard-coded lookup function, which rely on certain stripe:devid sequence. > > Recent btrfs-progs commit c501c9e3b816 ("btrfs-progs: mkfs: match devid > order to the stripe index") changes how we use devices in mkfs.btrfs, > this caused a change in chunk layout, and break the hard-coded > stripe:devid sequence. > > [FIX] > This patch will do full devid and physical offset lookup, instead of old > physical offset only lookup. > > The only assumption made is, mkfs.btrfs assigns devid sequentially for > its devices. > Which means, for "mkfs.btrfs $dev1 $dev2 $dev3", we get devid 1 for $dev1, > devid 2 for $dev2, and so on. > > This change will allow btrfs/14[23] to handle even future chunk layout > change. (Although I hope this will never happen again). > > This also addes extra debug output (although less than 10 lines) into > $seqres.full, just in case when layout changes and current lookup can't > handle it, developer can still pindown the problem easily. > > Reported-by: Filipe Manana <fdmanana@xxxxxxxx> > Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> > --- > tests/btrfs/142 | 50 +++++++++++++++++++++++++++++++-------------- > tests/btrfs/142.out | 2 -- > tests/btrfs/143 | 48 +++++++++++++++++++++++++++++++------------ > tests/btrfs/143.out | 2 -- > 4 files changed, 70 insertions(+), 32 deletions(-) > > diff --git a/tests/btrfs/142 b/tests/btrfs/142 > index a23fe1bf..9c037ff6 100755 > --- a/tests/btrfs/142 > +++ b/tests/btrfs/142 > @@ -47,30 +47,45 @@ _require_command "$FILEFRAG_PROG" filefrag > > get_physical() > { > - # $1 is logical address > - # print chunk tree and find devid 2 which is $SCRATCH_DEV > + local logical=$1 > + local stripe=$2 > $BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \ > - grep $1 -A 6 | awk '($1 ~ /stripe/ && $3 ~ /devid/ && $4 ~ /1/) { print $6 }' > + grep $logical -A 6 | \ > + awk "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$6 }" Should use $AWK_PROG instead of direct call to 'awk'. > } > > +get_devid() > +{ > + local logical=$1 > + local stripe=$2 > + $BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \ > + grep $logical -A 6 | \ > + awk "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$4 }" Same here. > +} > > -SYSFS_BDEV=`_sysfs_dev $SCRATCH_DEV` > +get_device_path() > +{ > + local devid=$1 > + echo "$SCRATCH_DEV_POOL" | awk "{print \$$devid}" > +} > > start_fail() > { > + local sysfs_bdev="$1" > echo 100 > $DEBUGFS_MNT/fail_make_request/probability > echo 2 > $DEBUGFS_MNT/fail_make_request/times > echo 1 > $DEBUGFS_MNT/fail_make_request/task-filter > echo 0 > $DEBUGFS_MNT/fail_make_request/verbose > - echo 1 > $SYSFS_BDEV/make-it-fail > + echo 1 > $sysfs_bdev/make-it-fail > } > > stop_fail() > { > + local sysfs_bdev="$1" > echo 0 > $DEBUGFS_MNT/fail_make_request/probability > echo 0 > $DEBUGFS_MNT/fail_make_request/times > echo 0 > $DEBUGFS_MNT/fail_make_request/task-filter > - echo 0 > $SYSFS_BDEV/make-it-fail > + echo 0 > $sysfs_bdev/make-it-fail > } > > _scratch_dev_pool_get 2 > @@ -87,17 +102,23 @@ _scratch_mount -o nospace_cache,nodatasum > $XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 128K 0 128K" "$SCRATCH_MNT/foobar" |\ > _filter_xfs_io_offset > > -# step 2, corrupt the first 64k of one copy (on SCRATCH_DEV which is the first > -# one in $SCRATCH_DEV_POOL > +# step 2, corrupt the first 64k of stripe #1 > echo "step 2......corrupt file extent" >>$seqres.full > > ${FILEFRAG_PROG} -v $SCRATCH_MNT/foobar >> $seqres.full > logical_in_btrfs=`${FILEFRAG_PROG} -v $SCRATCH_MNT/foobar | _filter_filefrag | cut -d '#' -f 1` > -physical_on_scratch=`get_physical ${logical_in_btrfs}` > +physical=`get_physical ${logical_in_btrfs} 1` > +devid=$(get_devid ${logical_in_btrfs} 1) > +target_dev=$(get_device_path $devid) > + > +SYSFS_BDEV=`_sysfs_dev $target_dev` > > _scratch_unmount > -$XFS_IO_PROG -d -c "pwrite -S 0xbb -b 64K $physical_on_scratch 64K" $SCRATCH_DEV |\ > - _filter_xfs_io_offset > +$BTRFS_UTIL_PROG ins dump-tree -t 3 $SCRATCH_DEV | \ > + grep $logical_in_btrfs -A 6 >> $seqres.full > +echo "Corrupt stripe 1 devid $devid devpath $target_dev physical $physical" \ > + >> $seqres.full > +$XFS_IO_PROG -d -c "pwrite -S 0xbb -b 64K $physical 64K" $target_dev > /dev/null > > _scratch_mount -o nospace_cache > > @@ -106,8 +127,7 @@ echo "step 3......repair the bad copy" >>$seqres.full > > # since raid1 consists of two copies, and the bad copy was put on stripe #1 > # while the good copy lies on stripe #0, the bad copy only gets access when the > -# reader's pid % 2 == 1 is true > -start_fail > +start_fail $SYSFS_BDEV > while [[ -z ${result} ]]; do > # enable task-filter only fails the following dio read so the repair is > # supposed to work. > @@ -117,12 +137,12 @@ while [[ -z ${result} ]]; do > exec $XFS_IO_PROG -d -c \"pread -b 128K 0 128K\" \"$SCRATCH_MNT/foobar\" > fi"); > done > -stop_fail > +stop_fail $SYSFS_BDEV > > _scratch_unmount > > # check if the repair works > -$XFS_IO_PROG -c "pread -v -b 512 $physical_on_scratch 512" $SCRATCH_DEV |\ > +$XFS_IO_PROG -c "pread -v -b 512 $physical 512" $target_dev | \ > _filter_xfs_io_offset > > _scratch_dev_pool_put > diff --git a/tests/btrfs/142.out b/tests/btrfs/142.out > index 0f32ffbe..2e22f292 100644 > --- a/tests/btrfs/142.out > +++ b/tests/btrfs/142.out > @@ -1,8 +1,6 @@ > QA output created by 142 > wrote 131072/131072 bytes > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > -wrote 65536/65536 bytes > -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > diff --git a/tests/btrfs/143 b/tests/btrfs/143 > index 91af52f9..b2ffeb63 100755 > --- a/tests/btrfs/143 > +++ b/tests/btrfs/143 > @@ -54,30 +54,48 @@ _require_command "$FILEFRAG_PROG" filefrag > > get_physical() > { > - # $1 is logical address > - # print chunk tree and find devid 2 which is $SCRATCH_DEV > + local logical=$1 > + local stripe=$2 > $BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \ > - grep $1 -A 6 | awk '($1 ~ /stripe/ && $3 ~ /devid/ && $4 ~ /1/) { print $6 }' > + grep $logical -A 6 | \ > + awk "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$6 }" Same here. > +} > + > +get_devid() > +{ > + local logical=$1 > + local stripe=$2 > + $BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \ > + grep $logical -A 6 | \ > + awk "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$4 }" Same here. > +} > + > +get_device_path() > +{ > + local devid=$1 > + echo "$SCRATCH_DEV_POOL" | awk "{print \$$devid}" > } > > SYSFS_BDEV=`_sysfs_dev $SCRATCH_DEV` > > start_fail() > { > + local sysfs_bdev="$1" > echo 100 > $DEBUGFS_MNT/fail_make_request/probability > # the 1st one fails the first bio which is reading 4k (or more due to > # readahead), and the 2nd one fails the retry of validation so that it > # triggers read-repair > echo 2 > $DEBUGFS_MNT/fail_make_request/times > echo 0 > $DEBUGFS_MNT/fail_make_request/verbose > - echo 1 > $SYSFS_BDEV/make-it-fail > + echo 1 > $sysfs_bdev/make-it-fail > } > > stop_fail() > { > + local sysfs_bdev="$1" > echo 0 > $DEBUGFS_MNT/fail_make_request/probability > echo 0 > $DEBUGFS_MNT/fail_make_request/times > - echo 0 > $SYSFS_BDEV/make-it-fail > + echo 0 > $sysfs_bdev/make-it-fail > } > > _scratch_dev_pool_get 2 > @@ -94,17 +112,21 @@ _scratch_mount -o nospace_cache,nodatasum > $XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 128K 0 128K" "$SCRATCH_MNT/foobar" |\ > _filter_xfs_io_offset > > -# step 2, corrupt the first 64k of one copy (on SCRATCH_DEV which is the first > -# one in $SCRATCH_DEV_POOL > +# step 2, corrupt the first 64k of stripe #1 > echo "step 2......corrupt file extent" >>$seqres.full > > ${FILEFRAG_PROG} -v $SCRATCH_MNT/foobar >> $seqres.full > logical_in_btrfs=`${FILEFRAG_PROG} -v $SCRATCH_MNT/foobar | _filter_filefrag | cut -d '#' -f 1` > -physical_on_scratch=`get_physical ${logical_in_btrfs}` > +physical=`get_physical ${logical_in_btrfs} 1` > +devid=$(get_devid ${logical_in_btrfs} 1) > +target_dev=$(get_device_path $devid) > > +SYSFS_BDEV=`_sysfs_dev $target_dev` > _scratch_unmount > -$XFS_IO_PROG -d -c "pwrite -S 0xbb -b 64K $physical_on_scratch 64K" $SCRATCH_DEV |\ > - _filter_xfs_io_offset > + > +echo "corrupt stripe 1 devid $devid devpath $target_dev physical $physical" \ > + >> $seqres.full > +$XFS_IO_PROG -d -c "pwrite -S 0xbb -b 64K $physical 64K" $target_dev > /dev/null > > _scratch_mount -o nospace_cache > > @@ -118,18 +140,18 @@ while [[ -z ${result} ]]; do > # invalidate the page cache. > _scratch_cycle_mount > > - start_fail > + start_fail $SYSFS_BDEV > result=$(bash -c " > if [[ \$((\$\$ % 2)) -eq 1 ]]; then > exec $XFS_IO_PROG -c \"pread 0 4K\" \"$SCRATCH_MNT/foobar\" > fi"); > - stop_fail > + stop_fail $SYSFS_BDEV > done > > _scratch_unmount > > # check if the repair works > -$XFS_IO_PROG -c "pread -v -b 512 $physical_on_scratch 512" $SCRATCH_DEV |\ > +$XFS_IO_PROG -c "pread -v -b 512 $physical 512" $target_dev |\ > _filter_xfs_io_offset > > _scratch_dev_pool_put > diff --git a/tests/btrfs/143.out b/tests/btrfs/143.out > index 66afea4b..a9e82ceb 100644 > --- a/tests/btrfs/143.out > +++ b/tests/btrfs/143.out > @@ -1,8 +1,6 @@ > QA output created by 143 > wrote 131072/131072 bytes > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > -wrote 65536/65536 bytes > -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ > -- > 2.23.0 > Other than that (which can probably be fixed at commit time by Eryu) it looks good and it solves the problem. Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx> Thanks. -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”