Re: [PATCH] fstests: btrfs: zoned: verify RAID conversion with write pointer mismatch

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Mon, Mar 17, 2025 at 3:05 PM Johannes Thumshirn <jth@xxxxxxxxxx> wrote:
>
> From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
>
> Recently we had a bug report about a kernel crash that happened when the
> user was converting a filesystem to use RAID1 for metadata, but for some
> reason the device's write pointers got out of sync.
>
> Test this scenario by manually injecting de-synchronized write pointer
> positions and then running conversion to a metadata RAID1 filesystem.
>
> In the testcase also repair the broken filesystem and check if both system
> and metadata block groups are back to the default 'DUP' profile
> afterwards.
>
> Link: https://lore.kernel.org/linux-btrfs/CAB_b4sBhDe3tscz=duVyhc9hNE+gu=B8CrgLO152uMyanR8BEA@xxxxxxxxxxxxxx/
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
> ---
>  tests/btrfs/329     | 61 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/329.out |  3 +++
>  2 files changed, 64 insertions(+)
>  create mode 100755 tests/btrfs/329
>  create mode 100644 tests/btrfs/329.out
>
> diff --git a/tests/btrfs/329 b/tests/btrfs/329
> new file mode 100755
> index 000000000000..441be133e230
> --- /dev/null
> +++ b/tests/btrfs/329
> @@ -0,0 +1,61 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 Western Digital Corporation.  All Rights Reserved.
> +#
> +# FS QA Test 329
> +#
> +# what am I here for?

Missing description here.

> +#
> +. ./common/preamble
> +_begin_fstest zone quick volume
> +
> +. ./common/filter
> +
> +_fixed_by_kernel_commit XXXXXXXXXXXX \
> +       "btrfs: zoned: return EIO on RAID1 block group write pointer mismatch"
> +
> +_require_scratch_dev_pool 2
> +declare -a devs="( $SCRATCH_DEV_POOL )"
> +_require_zoned_device ${devs[0]}
> +_require_zoned_device ${devs[1]}
> +_require_command "$BLKZONE_PROG" blkzone
> +
> +_scratch_mkfs >> $seqres.full 2>&1

|| _fail "mkfs failed"

That's what we do nowadays.

> +_scratch_mount
> +
> +# Write some data to the FS to dirty it
> +dd if=/dev/zero of=$SCRATCH_MNT/test bs=128k count=1024 >> $seqres.full 2>&1

Is there any reason to use dd?

A simple:

$XFS_IO_PROG -f "pwrite 0 128M" $SCRATCH_MNT/test

would do it and it's easier to read (don't have to do the
multiplication to figure out it's 128M of data).

Also, why redirect both stdout and stderr to the .full file?
We want to detect errors and fail with a golden output mismatch rather
than silently ignore errors, so at least don't redirect stderr.
I see this pattern done everywhere in the test case, effectively
ignoring errors.


> +
> +# Add device two to the FS
> +$BTRFS_UTIL_PROG device add ${devs[1]} $SCRATCH_MNT >> $seqres.full 2>&1

Same here.

> +
> +# Move write pointers of all empty zones by 4k to simulate write pointer
> +# mismatch.
> +# 'blkzone report' reports the zone numbers in sectors so we need to convert
> +# it to bytes first. Afterwards we need to convert it to 4k blocks for dd.
> +zones=$($BLKZONE_PROG report ${devs[1]} | $AWK_PROG '/em/ { print $2 }' |\
> +       sed 's/,//')
> +for zone in $zones;
> +do
> +       zone=$(($zone / 8))
> +       dd if=/dev/zero of=${devs[1]} seek=$zone bs=4k oflag=direct \
> +               count=1 >> $seqres.full 2>&1

Same here.

$XFS_IO_PROG can also be used.
And since this is using odirect, we should have a _require_odirect above.


> +done
> +
> +# expected to fail
> +$BTRFS_UTIL_PROG balance start -mconvert=raid1 $SCRATCH_MNT \
> +       >> $seqres.full 2>&1

Expected to fail, but the test is not checking if it fails at all - if
it doesn't fail it doesn't detect it all.
So don't redirect stderr and include stderr output in the golden output file.

> +
> +_scratch_unmount
> +
> +$MOUNT_PROG -t btrfs -odegraded ${devs[0]} $SCRATCH_MNT
> +
> +$BTRFS_UTIL_PROG device remove --force missing $SCRATCH_MNT >> $seqres.full 2>&1
> +$BTRFS_UTIL_PROG balance start --full-balance $SCRATCH_MNT >> $seqres.full 2>&1

Same here, ignoring failures.

> +
> +# Check that both System and Metadata are back to the DUP profile
> +$BTRFS_UTIL_PROG filesystem df /mnt/scratch/ |\

Please replace /mnt/scratch/ with $SCRATCH_MNT, otherwise the test
will not work on setups with a different scratch mount point (like
mine).

Thanks.

> +       grep -o -e "System, DUP" -e "Metadata, DUP"
> +
> +status=0
> +exit
> diff --git a/tests/btrfs/329.out b/tests/btrfs/329.out
> new file mode 100644
> index 000000000000..eab11130981d
> --- /dev/null
> +++ b/tests/btrfs/329.out
> @@ -0,0 +1,3 @@
> +QA output created by 329
> +System, DUP
> +Metadata, DUP
> --
> 2.43.0
>
>





[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux