Re: [PATCH] generic: add test for missing btrfs csums in log when doing async on subpage vol

Zorro Lang <zlang@xxxxxxxxxx> · Tue, 29 Oct 2024 13:11:41 +0800

On Mon, Oct 28, 2024 at 02:57:28PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
> > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> > race could mean that csums weren't getting written to the log tree,
> > leading to corruption when it was replayed.
> > 
> > The patches to detect this log tree corruption are in btrfs-progs 6.11.
> > 
> > Signed-off-by: Mark Harmstone <maharmstone@xxxxxx>
> > ---
> > This is a genericized version of the test I originally proposed as
> > btrfs/333.
> > 
> >  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/757.out |  2 ++
> >  2 files changed, 73 insertions(+)
> >  create mode 100755 tests/generic/757
> >  create mode 100644 tests/generic/757.out
> > 
> > diff --git a/tests/generic/757 b/tests/generic/757
> > new file mode 100755
> > index 00000000..6ad3d01e
> > --- /dev/null
> > +++ b/tests/generic/757
> > @@ -0,0 +1,71 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# FS QA Test 757
> > +#
> > +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> > +# weren't getting written to the log tree, causing corruptions on remount.
> > +# This can be seen on subpage FSes on Linux 6.4.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick metadata log recoveryloop
> > +
> > +_fixed_by_kernel_commit e917ff56c8e7 \
> > +	"btrfs: determine synchronous writers from bio or writeback control"
> > +
> > +fio_config=$tmp.fio
> > +
> > +. ./common/dmlogwrites
> > +
> > +_require_scratch
> > +_require_log_writes
> > +
> > +cat >$fio_config <<EOF
> > +[global]
> > +iodepth=128
> > +direct=1
> > +ioengine=libaio
> > +rw=randwrite
> > +runtime=1s
> > +[job0]
> > +rw=randwrite
> > +filename=$SCRATCH_MNT/file
> > +size=1g
> > +fdatasync=1
> > +EOF
> > +
> > +_require_fio $fio_config
> > +
> > +cat $fio_config >> $seqres.full
> > +
> > +_log_writes_init $SCRATCH_DEV
> > +_log_writes_mkfs >> $seqres.full 2>&1
> > +_log_writes_mark mkfs
> > +
> > +_log_writes_mount
> > +
> > +$FIO_PROG $fio_config > /dev/null 2>&1
> > +_log_writes_unmount
> > +
> > +_log_writes_remove
> > +
> > +prev=$(_log_writes_mark_to_entry_number mkfs)
> > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> > +cur=$(_log_writes_find_next_fua $prev)
> > +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> > +
> > +while [ ! -z "$cur" ]; do
> > +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> > +
> > +	_check_scratch_fs
> 
> This test fails on xfs because (afaict) replaying the log to $cur
> results in $SCRATCH_DEV being a filesystem with a dirty log; and
> xfs_repair fails when it is given a filesystem with a dirty log.
> 
> I then fixed the test to mount and unmount the filesystem to recover
> the dirty log before invoking xfs_repair:
> 
> 	# xfs_repair won't run if the log is dirty
> 	if [ $FSTYP = "xfs" ]; then
> 		_scratch_mount
> 		_scratch_unmount
> 	fi

Thanks Darrick, you're right.
I'm wondering whether we can always do a mount and unmount here, regardless
of $FSTYP, if that doesn't affect the testing of other filesystems?
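
A minimal sketch of what the unconditional variant could look like, assuming
the mount/unmount cycle is harmless on every filesystem (which is exactly the
open question; for btrfs the check may need to see the log tree before it is
replayed). The function name recover_log_and_check is hypothetical; the three
helpers it calls are the fstests ones already used in this thread.

```shell
# Hypothetical helper name; _scratch_mount, _scratch_unmount and
# _check_scratch_fs are the existing fstests helpers used in this thread.
recover_log_and_check()
{
	# Mounting replays a dirty log, so the checker that runs next
	# is not handed a filesystem with unrecovered log contents.
	_scratch_mount
	_scratch_unmount

	_check_scratch_fs
}
```

Inside the replay loop this would be called in place of the bare
_check_scratch_fs.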

> 	_check_scratch_fs
> 
> But now the test takes a very long time to run because (on my system
> anyway) the fio run can initiate 17,000 FUAs, which means that this loop
> runs that many times.  100 iterations take about 45 seconds, so the full
> run takes about two hours.
> 
> Is it necessary to iterate the loop that many times to reproduce
> whatever issue btrfs had?
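
For reference, the estimate quoted above can be reproduced with shell
arithmetic. The two inputs are the figures from Darrick's system; the
variable names are only illustrative.

```shell
# Figures quoted above: ~17,000 FUAs, ~45 seconds per 100 iterations.
fua_count=17000
secs_per_100=45

# One loop iteration per FUA, so scale the per-100 cost linearly.
total_secs=$(( fua_count * secs_per_100 / 100 ))
total_mins=$(( total_secs / 60 ))

echo "estimated runtime: ${total_secs}s (~${total_mins} min)"
```

That comes to 7650 seconds, a little over two hours.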

Yes, it takes a very long time on my side too:
 FSTYP         -- ext4
 PLATFORM      -- Linux/x86_64 dell-per750-47 6.12.0-rc4+ #1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:25:45 EDT 2024
 MKFS_OPTIONS  -- -F /dev/sda4
 MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

 generic/757        4247s
 Ran: generic/757
 Passed all 1 tests

So it would be better to reduce the test time as much as possible and remove
it from the "quick" group. (Maybe we could also have a tag to mark cases that
take a very long time.)
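
One possible direction, only as a sketch: bound the number of replay+check
cycles. The max_checks value and the check_fua_range name are hypothetical
and not part of the merged test, and whether a bounded walk still reproduces
the btrfs issue is the question Darrick raised above. The _log_writes_* and
_check_scratch_fs helpers are the ones the test already calls.

```shell
# Hypothetical cap on replay+check cycles; not part of the merged test.
max_checks=100

# Walk FUA entries starting at $1, stopping once max_checks is reached
# or no further FUA entry is found.
check_fua_range()
{
	local cur=$1
	local checked=0

	while [ -n "$cur" ] && [ "$checked" -lt "$max_checks" ]; do
		_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
		_check_scratch_fs

		checked=$((checked + 1))
		cur=$(_log_writes_find_next_fua $((cur + 1)))
	done
}
```

The first argument would be the entry returned by the initial
_log_writes_find_next_fua call, as in the existing loop.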

This patch has been merged into the for-next branch as:

  cf97fa373 generic: add test for missing btrfs csums in log when doing async on subpage vol

Please send another patch (or two) to fix the two problems above.

Thanks,
Zorro

> 
> --D
> 
> > +
> > +	prev=$cur
> > +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> > +	[ -z "$cur" ] && break
> > +done
> > +
> > +echo "Silence is golden"
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/757.out b/tests/generic/757.out
> > new file mode 100644
> > index 00000000..dfbc8094
> > --- /dev/null
> > +++ b/tests/generic/757.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 757
> > +Silence is golden
> > -- 
> > 2.44.2
> > 
> > 
>