Re: [PATCH] generic: test COW writeback failure when overlapping non-shared blocks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 21, 2021 at 11:40:05AM -0700, Darrick J. Wong wrote:
> On Thu, Oct 21, 2021 at 12:39:59PM -0400, Brian Foster wrote:
> > Test that COW writeback that overlaps non-shared delalloc blocks
> > does not leave around stale delalloc blocks on I/O failure. This
> > triggers assert failures and free space accounting corruption on
> > XFS.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> > 
> > This test targets the problem addressed by the following patch in XFS:
> > 
> > https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@xxxxxxxxxx/
> > 
> > Brian
> > 
> >  tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/651.out |  2 ++
> >  2 files changed, 55 insertions(+)
> >  create mode 100755 tests/generic/651
> >  create mode 100644 tests/generic/651.out
> > 
> > diff --git a/tests/generic/651 b/tests/generic/651
> > new file mode 100755
> > index 00000000..8d4e6728
> > --- /dev/null
> > +++ b/tests/generic/651
> > @@ -0,0 +1,53 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2021 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test 651
> > +#
> > +# Test that COW writeback that overlaps non-shared delalloc blocks does not
> > +# leave around stale delalloc blocks on I/O failure. This triggers assert
> > +# failures and free space accounting corruption on XFS.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick clone
> > +
> > +_cleanup()
> > +{
> > +	_cleanup_flakey
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +# Import common functions.
> > +. ./common/reflink
> > +. ./common/dmflakey
> > +
> > +# real QA test starts here
> > +_supported_fs generic
> > +_require_scratch_reflink
> > +_require_flakey_with_error_writes
> 
> _require_cp_reflink
> 
> > +
> > +_scratch_mkfs >> $seqres.full
> > +_init_flakey
> > +_mount_flakey
> > +
> > +# create two files that share a single block
> > +$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full
> 
> Please use:
> 
> blksz=$(_get_file_block_size $SCRATCH_MNT)
> $XFS_IO_PROG -fc "pwrite $blksz $blksz" $SCRATCH_MNT/file1 >> $seqres.full
> 
> So that this test will work properly on filesystems with bs > 4k.
> 

Yeah, I'll fix the various hardcoded sizes. Thanks.

> > +cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2
> 
> Nit: This could be shortened to use the _cp_reflink helper, though it
> doesn't really matter to me if you do.
> 

Didn't know we had it. I'll look into it.

> > +# Perform a buffered write across the shared and non-shared blocks. On XFS, this
> > +# creates a COW fork extent that covers the shared block as well as the just
> 
> Ah, the reason why there's a cow fork extent covering the delalloc
> reservation is due to the default cow extent size hint, right?  In that
> case, you need:
> 

Yeah..

> _require_xfs_io_command "cowextsize"
> $XFS_IO_PROG -c "cowextsize 0" $SCRATCH_MNT >> $seqres.full
> 
> to ensure that the speculative cow preallocation actually gets set up.
> Otherwise, I think test won't reproduce the bug if the test config has
> -d cowextsize=1 in the mkfs options.
> 

.. but then we aren't susceptible to the problem, right?

I sometimes waffle on whether it's better for a test to create a
problematic situation and test it, or run on the configuration specified
by the user and test a particular scenario against that. Maybe the
former makes more sense in this very specific test case, but then I
suppose "cowextsize blksz*2" (or whatever large enough value) is
probably more robust than "cowextsize 0" (which I assume means "default"
and thus can change, right)?

Brian

> > +# created non-shared delalloc block. Fail the writeback to verify that all
> > +# delayed allocation is cleaned up properly.
> > +_load_flakey_table $FLAKEY_ERROR_WRITES
> > +$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full
> 
> $((2 * blksz)), not 8k
> 
> Other than that, this looks reasonable to me.  I'll go look at the fix
> patch now. :)
> 
> --D
> 
> > +_load_flakey_table $FLAKEY_ALLOW_WRITES
> > +
> > +# Try a post-fail reflink and then unmount. Both of these are known to produce
> > +# errors and/or assert failures on XFS if we trip over a stale delalloc block.
> > +cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
> > +_unmount_flakey
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/651.out b/tests/generic/651.out
> > new file mode 100644
> > index 00000000..bd44c80c
> > --- /dev/null
> > +++ b/tests/generic/651.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 651
> > +fsync: Input/output error
> > -- 
> > 2.31.1
> > 
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux