On Thu, Sep 29, 2022 at 01:19:25PM +0900, Naohiro Aota wrote: > A ZNS device limits the number of active zones, which is the number of > zones can be written at the same time. To deal with the limit, btrfs's > zoned mode tracks which zone (corresponds to a block group on the SINGLE > profile) is active, and finish a zone if necessary. > > This test checks if the active zone tracking and the finishing of zones > works properly. First, it fills <number of max active zones> zones > mostly. And, run some data/metadata stress workload to force btrfs to use a > new zone. > > This test fails on an older kernel (e.g, 5.18.2) like below. > > btrfs/292 > [failed, exit status 1]- output mismatch (see /host/btrfs/292.out.bad) > --- tests/btrfs/292.out 2022-09-15 07:52:18.000000000 +0000 > +++ /host/btrfs/292.out.bad 2022-09-15 07:59:14.290967793 +0000 > @@ -1,2 +1,5 @@ > QA output created by 292 > -Silence is golden > +stress_data_bgs failed > +stress_data_bgs_2 failed > +failed: '/bin/btrfs subvolume snapshot /mnt/scratch /mnt/scratch/snap825' > +(see /host/btrfs/292.full for details) > ... > (Run 'diff -u /var/lib/xfstests/tests/btrfs/292.out /host/btrfs/292.out.bad' to see the entire diff) > > The failure is fixed with a series "btrfs: zoned: fix active zone tracking > issues" [1] (upstream commits from 65ea1b66482f ("block: add bdev_max_segments() > helper") to 2ce543f47843 ("btrfs: zoned: wait until zone is finished when > allocation didn't progress")). If this's a regression test case for known fix, we'd better to use: _fixed_by_kernel_commit 65ea1b66482f block: add bdev_max_segments (patchset) to remind downstream testers about that known issue. > > [1] https://lore.kernel.org/linux-btrfs/cover.1657321126.git.naohiro.aota@xxxxxxx/ > > Signed-off-by: Naohiro Aota <naohiro.aota@xxxxxxx> > --- > common/zoned | 11 ++++ > tests/btrfs/292 | 136 ++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/292.out | 2 + > 3 files changed, 149 insertions(+) > create mode 100755 tests/btrfs/292 > create mode 100644 tests/btrfs/292.out > > diff --git a/common/zoned b/common/zoned > index d1bc60f784a1..eed0082a15cf 100644 > --- a/common/zoned > +++ b/common/zoned > @@ -15,6 +15,17 @@ _filter_blkzone_report() > sed -e 's/len/cap/2' > } > > +_require_limited_active_zones() { > + local dev=$1 > + local sysfs=$(_sysfs_dev ${dev}) > + local attr="${sysfs}/queue/max_active_zones" > + > + [ -e "${attr}" ] || _notrun "cannot find queue/max_active_zones. Maybe non-zoned device?" > + if [ $(cat "${attr}") == 0 ]; then > + _notrun "this test requires limited active zones" > + fi > +} > + > _zone_capacity() { > local phy=$1 > local dev=$2 > diff --git a/tests/btrfs/292 b/tests/btrfs/292 > new file mode 100755 > index 000000000000..6cfd6b18c299 > --- /dev/null > +++ b/tests/btrfs/292 > @@ -0,0 +1,136 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2022 Western Digital Corporation. All Rights Reserved. > +# > +# FS QA Test 292 > +# > +# Test that an active zone is properly reclaimed to allow the further > +# allocations, even if the active zones are mostly filled. > +# > +. ./common/preamble > +_begin_fstest auto quick snapshot zone > + > +# Import common functions. > +. ./common/btrfs It's not necessary, due to fs specified common file is included automatically by _source_specific_fs() according to $FSTYP. > +. ./common/zoned > + > +# real QA test starts here > + > +_supported_fs btrfs > +_require_scratch > +_require_zoned_device "$SCRATCH_DEV" > +_require_limited_active_zones "$SCRATCH_DEV" > + > +_require_command "$BLKZONE_PROG" blkzone > +_require_btrfs_command inspect-internal dump-tree > + > +# This test requires specific data space usage, skip if we have compression > +# enabled. > +_require_no_compress > + > +max_active=$(cat $(_sysfs_dev ${SCRATCH_DEV})/queue/max_active_zones) > + > +# Fill the zones leaving the last 1MB > +fill_active_zones() { > + # Asuumes we have the same capacity between zones. > + local capacity=$(_zone_capacity 0) > + local fill_size=$((capacity - 1024 * 1024)) > + > + for x in $(seq ${max_active}); do > + dd if=/dev/zero of=${SCRATCH_MNT}/fill$(printf "%02d" $x) bs=${fill_size} \ What kind of indentation do you use? 4 spaces? 2 spaces? or a tab? Although that's not a big deal, 8 characters width tab is recommended in fstests :) > + count=1 oflag=direct 2>/dev/null > + $BTRFS_UTIL_PROG filesystem sync ${SCRATCH_MNT} > + > + local nactive=$($BLKZONE_PROG report ${SCRATCH_DEV} | grep oi | wc -l) > + if [[ ${nactive} == ${max_active} ]]; then > + break > + fi > + done > + > + echo "max active zones: ${max_active}" >> $seqres.full > + $BLKZONE_PROG report ${SCRATCH_DEV} | grep oi | cat -n >> $seqres.full > +} > + > +workout() { > + local func="$1" > + > + _scratch_mkfs >/dev/null 2>&1 > + _scratch_mount > + > + fill_active_zones > + eval "$func" > + local ret=$? > + > + _scratch_unmount > + _check_btrfs_filesystem ${SCRATCH_DEV} > + > + return $ret > +} > + > +stress_data_bgs() { > + # This dd fails with ENOSPC, which should not :( > + dd if=/dev/zero of=${SCRATCH_MNT}/large bs=64M count=1 oflag=sync \ > + >>$seqres.full 2>&1 > +} > + > +stress_data_bgs_2() { > + # This dd fails with ENOSPC, which should not :( > + dd if=/dev/zero of=${SCRATCH_MNT}/large bs=64M count=10 conv=fsync \ > + >>$seqres.full 2>&1 & > + local pid1=$! > + > + dd if=/dev/zero of=${SCRATCH_MNT}/large2 bs=64M count=10 conv=fsync \ > + >>$seqres.full 2>&1 & If we used background processes, we'd better to deal with them in cleanup time. But above two background processes are simple enough, I think you can add a "wait" in _cleanup to make sure these backgroud processes are done. Or you'd like to remove the "local" for pid1 and pid2, then does below in _cleanup: _cleanup() { [ -n "$pid1" ] && kill $pid1 [ -n "$pid2" ] && kill $pid2 wait ... } And ... > + local pid2=$! > + > + wait $pid1; local ret1=$? unset pid1 > + wait $pid2; local ret2=$? unset pid2 At here > + > + if [ $ret1 -ne 0 -o $ret2 -ne 0 ]; then > + return 1 > + fi > + return 0 > +} > + > +get_meta_bgs() { > + $BTRFS_UTIL_PROG inspect-internal dump-tree -t EXTENT ${SCRATCH_DEV} | > + grep BLOCK_GROUP -A 1 |grep -B1 'METADATA|' | > + grep -oP '\(\d+ BLOCK_GROUP_ITEM \d+\)' > +} > + > +# This test case does not return the result because > +# _run_btrfs_util_prog will call _fail() in the error case anyway. > +stress_metadata_bgs() { > + local metabgs=$(get_meta_bgs) > + local count=0 > + > + while : ; do > + _run_btrfs_util_prog subvolume snapshot ${SCRATCH_MNT} ${SCRATCH_MNT}/snap$i > + _run_btrfs_util_prog filesystem sync ${SCRATCH_MNT} > + cur_metabgs=$(get_meta_bgs) > + if [[ "${cur_metabgs}" != "${metabgs}" ]]; then > + break > + fi > + i=$((i + 1)) > + done > +} > + > +WORKS=( > + stress_data_bgs > + stress_data_bgs_2 > + stress_metadata_bgs > +) > + > +status=0 > +for work in "${WORKS[@]}"; do > + if ! workout "${work}"; then > + echo "${work} failed" > + status=1 > + fi > +done The $status is 1 at the beginning of each case, and we set it to 0 at the end of each case. If a test case _fail exit in the middle, then status keep be 1. For your case, I think you don't need to touch the $status, you _fail directly if anyone *worktout* returns failure. Or just let the "${work} failed" output breaks the golden image(.out) to cause a test failure (no matter the status=0/1). > + > +# success, all done > +if [ $status -eq 0 ]; then > + echo "Silence is golden" > +fi You can output "Silence is golden" at here directly, due to this case expect that "Silence". I'm not the best reviewer for a zoned&btrfs related case, so I tried to review from fstests side. I saw Johannes Thumshirn <johannes.thumshirn@xxxxxxx> has given you a RVB last time, I think you can keep that as a review from btrfs and zbd side (except he has more review points now). Thanks, Zorro > +exit > diff --git a/tests/btrfs/292.out b/tests/btrfs/292.out > new file mode 100644 > index 000000000000..627309d3fbd2 > --- /dev/null > +++ b/tests/btrfs/292.out > @@ -0,0 +1,2 @@ > +QA output created by 292 > +Silence is golden > -- > 2.37.3 >