On Wed, Aug 26, 2020 at 5:38 PM Brian Foster <bfoster@xxxxxxxxxx> wrote: > > Several generic fstests use dm-log-writes to test the filesystem for > consistency at various crash recovery points. dm-log-writes and the > associated replay mechanism rely on zeroing via discard to clear > stale blocks when moving to various points in time of the fs. If the > storage doesn't provide zeroing or the discard requests exceed the > hardcoded maximum (128MB) of the fallback solution to physically > write zeroes, stale blocks are left around in the target fs. This > scheme is known to cause issues on XFS v5 superblocks if recovery > observes metadata from a future variant of an fs that has been > replayed to an older point in time. This corrupts the filesystem and > leads to false test failures. > > generic/482 already works around this problem by using a thin volume > as the target device, which provides consistent and efficient > discard zeroing behavior, but other tests have seen similar issues > on XFS. Add an XFS specific check to the dmlogwrites init time code > that requires discard zeroing support and otherwise skips the test > to avoid false positive failures. > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > --- > common/dmlogwrites | 10 ++++++++-- > common/rc | 14 ++++++++++++++ > tests/generic/470 | 2 +- > 3 files changed, 23 insertions(+), 3 deletions(-) > > diff --git a/common/dmlogwrites b/common/dmlogwrites > index 573f4b8a..92cc6ce2 100644 > --- a/common/dmlogwrites > +++ b/common/dmlogwrites > @@ -43,9 +43,10 @@ _require_log_writes_dax_mountopt() > _require_test_program "log-writes/replay-log" > > local ret=0 > - local mountopt=$1 > + local dev=$1 > + local mountopt=$2 > > - _log_writes_init $SCRATCH_DEV > + _log_writes_init $dev > _log_writes_mkfs > /dev/null 2>&1 > _log_writes_mount "-o $mountopt" > /dev/null 2>&1 > # Check options to be sure. > @@ -66,6 +67,11 @@ _log_writes_init() > [ -z "$blkdev" ] && _fail \ > "block dev must be specified for _log_writes_init" > > + # XFS requires discard zeroing support on the target device to work > + # reliably with dm-log-writes. Use dm-thin devices in tests that want > + # to provide reliable discard zeroing support. > + [ $FSTYP == "xfs" ] && _require_discard_zeroes $blkdev > + I imagine that ext4 could also be burned by this. Do we have a reason to limit this requirement to xfs? I prefer to make it generic. > local BLK_DEV_SIZE=`blockdev --getsz $blkdev` > LOGWRITES_NAME=logwrites-test > LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME > diff --git a/common/rc b/common/rc > index aa5a7409..fedb5221 100644 > --- a/common/rc > +++ b/common/rc > @@ -4313,6 +4313,20 @@ _require_mknod() > rm -f $TEST_DIR/$seq.null > } > > +# check that discard is supported and subsequent reads return zeroes > +_require_discard_zeroes() > +{ > + local dev=$1 > + > + _require_command "$BLKDISCARD_PROG" blkdiscard > + > + $XFS_IO_PROG -c "pwrite -S 0xcd 0 4k" $dev > /dev/null 2>&1 || > + _fail "write error" > + $BLKDISCARD_PROG -o 0 -l 1m $dev || _notrun "no discard support" > + hexdump -n 4096 $dev | head -n 1 | grep cdcd && > + _notrun "no discard zeroing support" > +} > + I am fine with your solution, but if there was a discussion on the best way to solve the problem, I missed it, so would like to hear what Chritoph has to say. Thanks, Amir.