Re: [PATCH] generic: test COW writeback failure when overlapping non-shared blocks

On Thu, Oct 21, 2021 at 03:09:49PM -0400, Brian Foster wrote:
> On Thu, Oct 21, 2021 at 11:40:05AM -0700, Darrick J. Wong wrote:
> > On Thu, Oct 21, 2021 at 12:39:59PM -0400, Brian Foster wrote:
> > > Test that COW writeback that overlaps non-shared delalloc blocks
> > > does not leave around stale delalloc blocks on I/O failure. This
> > > triggers assert failures and free space accounting corruption on
> > > XFS.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > ---
> > > 
> > > This test targets the problem addressed by the following patch in XFS:
> > > 
> > > https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@xxxxxxxxxx/
> > > 
> > > Brian
> > > 
> > >  tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/651.out |  2 ++
> > >  2 files changed, 55 insertions(+)
> > >  create mode 100755 tests/generic/651
> > >  create mode 100644 tests/generic/651.out
> > > 
> > > diff --git a/tests/generic/651 b/tests/generic/651
> > > new file mode 100755
> > > index 00000000..8d4e6728
> > > --- /dev/null
> > > +++ b/tests/generic/651
> > > @@ -0,0 +1,53 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2021 Red Hat, Inc.  All Rights Reserved.
> > > +#
> > > +# FS QA Test 651
> > > +#
> > > +# Test that COW writeback that overlaps non-shared delalloc blocks does not
> > > +# leave around stale delalloc blocks on I/O failure. This triggers assert
> > > +# failures and free space accounting corruption on XFS.
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto quick clone
> > > +
> > > +_cleanup()
> > > +{
> > > +	_cleanup_flakey
> > > +	cd /
> > > +	rm -r -f $tmp.*
> > > +}
> > > +
> > > +# Import common functions.
> > > +. ./common/reflink
> > > +. ./common/dmflakey
> > > +
> > > +# real QA test starts here
> > > +_supported_fs generic
> > > +_require_scratch_reflink
> > > +_require_flakey_with_error_writes
> > 
> > _require_cp_reflink
> > 
> > > +
> > > +_scratch_mkfs >> $seqres.full
> > > +_init_flakey
> > > +_mount_flakey
> > > +
> > > +# create two files that share a single block
> > > +$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full
> > 
> > Please use:
> > 
> > blksz=$(_get_file_block_size $SCRATCH_MNT)
> > $XFS_IO_PROG -fc "pwrite $blksz $blksz" $SCRATCH_MNT/file1 >> $seqres.full
> > 
> > So that this test will work properly on filesystems with bs > 4k.
> > 
> 
> Yeah, I'll fix the various hardcoded sizes. Thanks.
> 
> > > +cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2
> > 
> > Nit: This could be shortened to use the _cp_reflink helper, though it
> > doesn't really matter to me if you do.
> > 
> 
> Didn't know we had it. I'll look into it.
> 
> > > +# Perform a buffered write across the shared and non-shared blocks. On XFS, this
> > > +# creates a COW fork extent that covers the shared block as well as the just
> > 
> > Ah, the reason why there's a cow fork extent covering the delalloc
> > reservation is due to the default cow extent size hint, right?  In that
> > case, you need:
> > 
> 
> Yeah..
> 
> > _require_xfs_io_command "cowextsize"
> > $XFS_IO_PROG -c "cowextsize 0" $SCRATCH_MNT >> $seqres.full
> > 
> > to ensure that the speculative cow preallocation actually gets set up.
> > Otherwise, I think the test won't reproduce the bug if the test config has
> > -d cowextsize=1 in the mkfs options.
> > 
> 
> .. but then we aren't susceptible to the problem, right?
> 
> I sometimes waffle on whether it's better for a test to create a
> problematic situation and test it, or run on the configuration specified
> by the user and test a particular scenario against that. Maybe the
> former makes more sense in this very specific test case, but then I
> suppose "cowextsize blksz*2" (or whatever large enough value) is

Yes, blksz*2.

> probably more robust than "cowextsize 0" (which I assume means "default"
> and thus can change, right)?

Seeing as this is a reproducer, explicitly setting cowextsize seems
appropriate.
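
As a rough, untested sketch of what the block-size-relative commands might
look like: the helper name below is hypothetical (it is not an fstests
function), and in the real test blksz would come from _get_file_block_size
rather than being passed in by hand.

```shell
# Hypothetical helper, not part of fstests: print the xfs_io commands the
# reproducer would run for a given file block size (in bytes).
reproducer_cmds()
{
	blksz="$1"
	echo "cowextsize $((blksz * 2))"	# COW hint spanning both blocks
	echo "pwrite $blksz $blksz"		# file1: the block that becomes shared
	echo "pwrite 0 $((blksz * 2))"		# file2: non-shared block + shared block
}

reproducer_cmds 4096
```

The point is only that every size derives from blksz, so nothing assumes 4k
blocks.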

Alternately, I suppose you could detect the one case where it won't
work (cowextsize == 1fsb) and only then change it.
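
A minimal, untested sketch of that detection. It assumes that querying the
hint with xfs_io's "cowextsize" command prints it in bytes as "[N] path",
and that a hint of 0 means the (larger than one block) default. The helper
name is made up for illustration.

```shell
# Hypothetical helper, not part of fstests: succeed only when the current COW
# extent size hint is exactly one filesystem block.
# $1: output of `xfs_io -c cowextsize <path>`, assumed to be "[bytes] path"
# $2: filesystem block size in bytes
cowextsize_is_one_block()
{
	hint="${1#\[}"		# strip the leading "["
	hint="${hint%%\]*}"	# strip "]" and the trailing path
	[ "$hint" -eq "$2" ]
}

# Intended use in the test (assumes $blksz was set earlier):
#   cur=$($XFS_IO_PROG -c "cowextsize" $SCRATCH_MNT)
#   if cowextsize_is_one_block "$cur" "$blksz"; then
#       $XFS_IO_PROG -c "cowextsize $((blksz * 2))" $SCRATCH_MNT >> $seqres.full
#   fi
```

That leaves the user's configuration alone unless the hint would prevent the
COW extent from covering the neighbouring delalloc block.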

--D

> 
> Brian
> 
> > > +# created non-shared delalloc block. Fail the writeback to verify that all
> > > +# delayed allocation is cleaned up properly.
> > > +_load_flakey_table $FLAKEY_ERROR_WRITES
> > > +$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full
> > 
> > $((2 * blksz)), not 8k
> > 
> > Other than that, this looks reasonable to me.  I'll go look at the fix
> > patch now. :)
> > 
> > --D
> > 
> > > +_load_flakey_table $FLAKEY_ALLOW_WRITES
> > > +
> > > +# Try a post-fail reflink and then unmount. Both of these are known to produce
> > > +# errors and/or assert failures on XFS if we trip over a stale delalloc block.
> > > +cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
> > > +_unmount_flakey
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/651.out b/tests/generic/651.out
> > > new file mode 100644
> > > index 00000000..bd44c80c
> > > --- /dev/null
> > > +++ b/tests/generic/651.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 651
> > > +fsync: Input/output error
> > > -- 
> > > 2.31.1
> > > 
> > 
> 


