On 18.03.25 11:28, Filipe Manana wrote: > On Mon, Mar 17, 2025 at 3:05 PM Johannes Thumshirn <jth@xxxxxxxxxx> wrote: >> >> From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx> >> >> Recently we had a bug report about a kernel crash that happened when the >> user was converting a filesystem to use RAID1 for metadata, but for some >> reason the device's write pointers got out of sync. >> >> Test this scenario by manually injecting de-synchronized write pointer >> positions and then running conversion to a metadata RAID1 filesystem. >> >> In the testcase also repair the broken filesystem and check if both system >> and metadata block groups are back to the default 'DUP' profile >> afterwards. >> >> Link: https://lore.kernel.org/linux-btrfs/CAB_b4sBhDe3tscz=duVyhc9hNE+gu=B8CrgLO152uMyanR8BEA@xxxxxxxxxxxxxx/ >> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx> >> --- >> tests/btrfs/329 | 61 +++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/329.out | 3 +++ >> 2 files changed, 64 insertions(+) >> create mode 100755 tests/btrfs/329 >> create mode 100644 tests/btrfs/329.out >> >> diff --git a/tests/btrfs/329 b/tests/btrfs/329 >> new file mode 100755 >> index 000000000000..441be133e230 >> --- /dev/null >> +++ b/tests/btrfs/329 >> @@ -0,0 +1,61 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (c) 2025 Western Digital Corporation. All Rights Reserved. >> +# >> +# FS QA Test 329 >> +# >> +# what am I here for? > > Missing description here. Oops. > >> +# >> +. ./common/preamble >> +_begin_fstest zone quick volume >> + >> +. ./common/filter >> + >> +_fixed_by_kernel_commit XXXXXXXXXXXX \ >> + "btrfs: zoned: return EIO on RAID1 block group write pointer mismatch" >> + >> +_require_scratch_dev_pool 2 >> +declare -a devs="( $SCRATCH_DEV_POOL )" >> +_require_zoned_device ${devs[0]} >> +_require_zoned_device ${devs[1]} >> +_require_command "$BLKZONE_PROG" blkzone >> + >> +_scratch_mkfs >> $seqres.full 2>&1 > > || _fail "mkfs failed" > > That's what we do nowadays. > >> +_scratch_mount >> + >> +# Write some data to the FS to dirty it >> +dd if=/dev/zero of=$SCRATCH_MNT/test bs=128k count=1024 >> $seqres.full 2>&1 > > Is there any reason to use dd? > > A simple: > > $XFS_IO_PROG -f "pwrite 0 128M" $SCRATCH_MNT/test > > would do it and it's easier to read (don't have to do the > multiplication to figure out it's 128M of data). > > Also, why redirect both stdout and stderr to the .full file? > We want to detect errors and fail with a golden output mismatch rather > than silently ignore errors, so at least don't redirect stderr. > I see this pattern done everywhere in the test case, effectively > ignoring errors. Right, xfs_io will be a lot easier. > > >> + >> +# Add device two to the FS >> +$BTRFS_UTIL_PROG device add ${devs[1]} $SCRATCH_MNT >> $seqres.full 2>&1 > > Same here. > >> + >> +# Move write pointers of all empty zones by 4k to simulate write pointer >> +# mismatch. >> +# 'blkzone report' reports the zone numbers in sectors so we need to convert >> +# it to bytes first. Afterwards we need to convert it to 4k blocks for dd. >> +zones=$($BLKZONE_PROG report ${devs[1]} | $AWK_PROG '/em/ { print $2 }' |\ >> + sed 's/,//') >> +for zone in $zones; >> +do >> + zone=$(($zone / 8)) >> + dd if=/dev/zero of=${devs[1]} seek=$zone bs=4k oflag=direct \ >> + count=1 >> $seqres.full 2>&1 > > Same here. > > $XFS_IO_PROG can also be used. > And since this is using odirect, we should have a _require_odirect above. The O_DIRECT is for the block device file, not the underlying filesystem and it is used the inject zone write pointer mismatches by bypassing the FS. So while I can add _require_odirect, it doesn't really matter for this case. > > >> +done >> + >> +# expected to fail >> +$BTRFS_UTIL_PROG balance start -mconvert=raid1 $SCRATCH_MNT \ >> + >> $seqres.full 2>&1 > > Expected to fail, but the test is not checking if it fails at all - if > it doesn't fail it doesn't detect it all. > So don't redirect stderr and include stderr output in the golden output file. > >> + >> +_scratch_unmount >> + >> +$MOUNT_PROG -t btrfs -odegraded ${devs[0]} $SCRATCH_MNT >> + >> +$BTRFS_UTIL_PROG device remove --force missing $SCRATCH_MNT >> $seqres.full 2>&1 >> +$BTRFS_UTIL_PROG balance start --full-balance $SCRATCH_MNT >> $seqres.full 2>&1 > > Same here, ignoring failures. > >> + >> +# Check that both System and Metadata are back to the DUP profile >> +$BTRFS_UTIL_PROG filesystem df /mnt/scratch/ |\ > > Please replace /mnt/scratch/ with $SCRATCH_MNT, otherwise the test > will not work on setups with a different scratch mount point (like > mine). Oops sure.