On Wed, Oct 2, 2019 at 5:29 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > > From: Josef Bacik <jbacik@xxxxxx> > > I discovered a problem in btrfs where we'd end up pointing at a block we > hadn't written out yet. This is triggered by a race when two different > files on two different subvolumes fsync. This test exercises this path > with dm-log-writes, and then replays the log at every FUA to verify the > file system is still mountable and the log is replayable. Please mention in the changelog, or in the test's description, the subject of the patch that fixes the problem. Otherwise people running the test and seeing failures will have a hard time figuring things out (think of QA teams for example, or anyone some weeks or months after, where context is lost and mailing list search is not that great). > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx> > --- > tests/btrfs/194 | 102 ++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/194.out | 2 + > tests/btrfs/group | 1 + > 3 files changed, 105 insertions(+) > create mode 100755 tests/btrfs/194 > create mode 100644 tests/btrfs/194.out > > diff --git a/tests/btrfs/194 b/tests/btrfs/194 > new file mode 100755 > index 00000000..d5edb313 > --- /dev/null > +++ b/tests/btrfs/194 > @@ -0,0 +1,102 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2019 Facebook. All Rights Reserved. > +# > +# FS QA Test 194 > +# > +# Test multi subvolume fsync to test a bug where we'd end up pointing at a block > +# we haven't written. > +# > +# Will do log replay and check the filesystem. > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +fio_config=$tmp.fio This isn't used anywhere. > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + cd / > + $KILLALL_PROG -KILL -q $FSSTRESS_PROG &> /dev/null This line can go away, the test doesn't use fsstress. > + _log_writes_cleanup &> /dev/null > + _dmthin_cleanup > + rm -f $tmp.* > +} > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmthin > +. ./common/dmlogwrites > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs generic > +_supported_os Linux > + > +# Use thin device as replay device, which requires $SCRATCH_DEV > +_require_scratch_nocheck > +# and we need extra device as log device > +_require_log_writes > +_require_dm_target thin-pool > + > +_require_fio All existing tests I could see, do "_require_fio $fio_config", where $fio_config is not empty. > + > +# Use a thin device to provide deterministic discard behavior. Discards are used > +# by the log replay tool for fast zeroing to prevent out-of-order replay issues. > +_test_unmount > +_dmthin_init $devsize $devsize $csize $lowspace > +_log_writes_init $DMTHIN_VOL_DEV > +_log_writes_mkfs >> $seqres.full 2>&1 > +_log_writes_mark mkfs > + > +_log_writes_mount > + > +# First create all the subvolumes > +for i in $(seq 0 49) > +do The style followed by fstests is "for i in $(seq 0 49); do", just like you did in the while loop below. > + $BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/sub$i" > /dev/null > +done > + > +# Then run 100 fio jobs in parallel > +for i in $(seq 0 49) > +do Same here. > + fio --name=seq --readwrite=write --fallocate=none --bs=4k --fsync=1 \ > + --size=64k --filename="$SCRATCH_MNT/sub$i/file" \ > + > /dev/null 2>&1 & fio -> $FIO_PROG And then $FIO_PROG $fio_config, as all other tests using fio do. > +done > +wait > +_log_writes_unmount > + > +_log_writes_remove > +prev=$(_log_writes_mark_to_entry_number mkfs) > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > +cur=$(_log_writes_find_next_fua $prev) > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > + > +while [ ! -z "$cur" ]; do > + _log_writes_replay_log_range $cur $DMTHIN_VOL_DEV >> $seqres.full > + > + # We need to mount the fs because btrfsck won't bother checking the log. > + _dmthin_mount > + _dmthin_check_fs > + > + prev=$cur > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > + [ -z "$cur" ] && break > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/194.out b/tests/btrfs/194.out > new file mode 100644 > index 00000000..7bfd50ff > --- /dev/null > +++ b/tests/btrfs/194.out > @@ -0,0 +1,2 @@ > +QA output created by 194 > +Silence is golden > diff --git a/tests/btrfs/group b/tests/btrfs/group > index b92cb12c..0d0e1bba 100644 > --- a/tests/btrfs/group > +++ b/tests/btrfs/group > @@ -196,3 +196,4 @@ > 191 auto quick send dedupe > 192 auto replay snapshot stress > 193 auto quick qgroup enospc limit > +194 auto metadata log volume Other than that it looks fine to me. However, on a VM with 8Gb of ram, I can't run the test: root 11:09:10 /home/fdmanana/git/hub/xfstests (master)> ./check btrfs/194 FSTYP -- btrfs PLATFORM -- Linux/x86_64 debian5 5.3.0-rc8-btrfs-next-58 #1 SMP Tue Sep 17 18:18:55 WEST 2019 MKFS_OPTIONS -- /dev/sdc MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1 btrfs/194 [failed, exit status 137]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/194.out.bad) --- tests/btrfs/194.out 2019-10-02 10:51:45.485575789 +0100 +++ /home/fdmanana/git/hub/xfstests/results//btrfs/194.out.bad 2019-10-02 11:12:11.382533722 +0100 @@ -1,2 +1,2 @@ QA output created by 194 -Silence is golden +./check: line 513: 1324 Killed bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq" ... (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/194.out /home/fdmanana/git/hub/xfstests/results//btrfs/194.out.bad' to see the entire diff) Ran: btrfs/194 Failures: btrfs/194 Failed 1 of 1 tests root 11:12:22 /home/fdmanana/git/hub/xfstests (master)> cat /home/fdmanana/git/hub/xfstests/results//btrfs/194.out.bad QA output created by 194 ./check: line 513: 1324 Killed bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq" root 11:19:39 /home/fdmanana/git/hub/xfstests (master)> 8Gb is a very reasonable amount of ram. This is on a debug kernel (kmemleak, kasan, debug_pagealloc, etc, enabled). Not uncommon among developers at least (even 4Gb of ram is common). Thanks. > -- > 2.21.0 > -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”