Re: [PATCH 8/8] xfs: test for premature ENOSPC with large cow delalloc extents

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 27, 2024 at 02:00:21PM +0800, Zorro Lang wrote:
> On Mon, Feb 26, 2024 at 06:02:37PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > On a higly fragmented filesystem a Direct IO write can fail with -ENOSPC error
> > even though the filesystem has sufficient number of free blocks.
> > 
> > This occurs if the file offset range on which the write operation is being
> > performed has a delalloc extent in the cow fork and this delalloc extent
> > begins much before the Direct IO range.
> > 
> > In such a scenario, xfs_reflink_allocate_cow() invokes xfs_bmapi_write() to
> > allocate the blocks mapped by the delalloc extent. The extent thus allocated
> > may not cover the beginning of file offset range on which the Direct IO write
> > was issued. Hence xfs_reflink_allocate_cow() ends up returning -ENOSPC.
> > 
> > This test addresses this issue.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >  common/rc          |   14 +++++++++
> >  tests/xfs/1923     |   85 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1923.out |    8 +++++
> >  3 files changed, 107 insertions(+)
> >  create mode 100755 tests/xfs/1923
> >  create mode 100644 tests/xfs/1923.out
> > 
> > 
> > diff --git a/common/rc b/common/rc
> > index 30c44dddd9..d3a2a0718b 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -1873,6 +1873,20 @@ _require_scratch_delalloc()
> >  	_scratch_unmount
> >  }
> >  
> > +# Require test fs supports delay allocation.
> > +_require_test_delalloc()
> > +{
> > +	_require_command "$FILEFRAG_PROG" filefrag
> > +
> > +	rm -f $TEST_DIR/testy
> > +	$XFS_IO_PROG -f -c 'pwrite 0 64k' $TEST_DIR/testy &> /dev/null
> > +	$FILEFRAG_PROG -v $TEST_DIR/testy 2>&1 | grep -q delalloc
> 
> I'm wondering if it's a 100% reliable way to get the "delalloc" flag when
> the delalloc is supported? If not, is it worth testing with several files
> or a loop, then return 0 if one of them get delalloc ?
> 
> > +	res=$?
> > +	rm -f $TEST_DIR/testy
> > +	test $res -eq 0 || \
> > +		_notrun "test requires delayed allocation buffered writes"
> > +}
> > +
> >  # this test needs a test partition - check we're ok & mount it
> >  #
> >  _require_test()
> > diff --git a/tests/xfs/1923 b/tests/xfs/1923
> > new file mode 100755
> > index 0000000000..4e494ad8c2
> > --- /dev/null
> > +++ b/tests/xfs/1923
> > @@ -0,0 +1,85 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2022-2024 Oracle.  All Rights Reserved.
> > +#
> > +# FS QA Test No. 1923
> > +#
> > +# This is a regression test for "xfs: Fix false ENOSPC when performing direct
> > +# write on a delalloc extent in cow fork".  If there is a lot of free space but
> > +# it is very fragmented, it's possible that a very large delalloc reservation
> > +# could be created in the CoW fork by a buffered write.  If a directio write
> > +# tries to convert the delalloc reservation to a real extent, it's possible
> > +# that the allocation will succeed but fail to convert even the first block of
> > +# the directio write range.  In this case, XFS will return ENOSPC even though
> > +# all it needed to do was to keep converting until the allocator returns ENOSPC
> > +# or the first block of the direct write got some space.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick clone
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -f $file1 $file2 $fragmentedfile
> > +}
> > +
> > +# Import common functions.
> > +. ./common/reflink
> > +. ./common/inject
> > +
> > +# real QA test starts here
> > +_fixed_by_kernel_commit XXXXX \
> > +	"xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork"
> 
> Is it this commit below, or a new fix with same subject?
> 
> commit d62113303d691bcd8d0675ae4ac63e7769afc56c
> Author: Chandan Babu R <chandan.babu@xxxxxxxxxx>
> Date:   Thu Aug 4 08:59:27 2022 -0700
> 
>     xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork
> 
> If it's an old commit, please replace the "XXXXX" to "d62113303d69".

Ah, yes, it is that one.  Sorry I forgot to update that.  As you can
tell, this is a really old test that ... I guess we never sent
upstream.

> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> 
> "xfs"? As it's in tests/xfs/ directory.

<nod>

> > +_require_test_program "punch-alternating"
> > +_require_test_reflink
> > +_require_xfs_io_error_injection "bmap_alloc_minlen_extent"
> > +_require_test_delalloc
> > +
> > +file1=$TEST_DIR/file1.$seq
> > +file2=$TEST_DIR/file2.$seq
> > +fragmentedfile=$TEST_DIR/fragmentedfile.$seq
> 
> As you use the $TEST_DIR, it might be worth making sure these files
> aren't existed. Due to the files (from other cases) in $TEST_DIR might
> not be cleaned, especially if they don't take much free space.

Ok.

> So if these 3 files won't take much space, we can keep them, don't need
> a specific _cleanup(). And move the "rm -f $file1 $file2 $fragmentedfile"
> at here.

Well you make a good point that we should remove those files so that
they end up in a known state (i.e. nonexistence) where we can't get
tripped up by someone creating, say, a /dev/sda symlink to
$TEST_DIR/file1.$seq.

I don't think the files are all that big, but I'll continue removing
them anyway.  Deletion is part of aging.

> > +
> > +# COW operates on pages, so we must not perform operations in units smaller
> > +# than a page.
> > +blksz=$(_get_file_block_size $TEST_DIR)
> > +pagesz=$(_get_page_size)
> > +if (( $blksz < $pagesz )); then
> > +	blksz=$pagesz
> > +fi
> 
> Just curious, this's a xfs specific test case, can xfs support blocksize >
> pagesize? If not, can we just use pagesz directly at here ?

It doesn't support bs > ps yet, but the point of clamping $blksz here
is the fact that we'll never try to writeback less than a page, so we
might as well run with those units.

(This test has nothing to do with the ongoing bs>ps effort.)

> > +
> > +echo "Create source file"
> > +$XFS_IO_PROG -f -c "pwrite 0 $((blksz * 256))" $file1 >> $seqres.full
> > +
> > +sync
> > +
> > +echo "Create Reflinked file"
> > +_cp_reflink $file1 $file2 >> $seqres.full
> > +#$XFS_IO_PROG -f -c "reflink $file1" $file2 >> $seqres.full
> 
> There's a "#", do you hope to run it or not?

Oops.  That's redundant with the _cp_reflink.

--D

> > +
> > +echo "Set cowextsize"
> > +$XFS_IO_PROG -c "cowextsize $((blksz * 128))" -c stat $file1 >> $seqres.full
> > +
> > +echo "Fragment FS"
> > +$XFS_IO_PROG -f -c "pwrite 0 $((blksz * 512))" $fragmentedfile >> $seqres.full
> > +sync
> > +$here/src/punch-alternating $fragmentedfile
> > +
> > +echo "Allocate block sized extent from now onwards"
> > +_test_inject_error bmap_alloc_minlen_extent 1
> > +
> > +echo "Create big delalloc extent in CoW fork"
> > +$XFS_IO_PROG -c "pwrite 0 $blksz" $file1 >> $seqres.full
> > +
> > +sync
> > +
> > +$XFS_IO_PROG -c 'bmap -elpv' -c 'bmap -celpv' $file1 >> $seqres.full
> > +
> > +echo "Direct I/O write at offset 3FSB"
> > +$XFS_IO_PROG -d -c "pwrite $((blksz * 3)) $((blksz * 2))" $file1 >> $seqres.full
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/xfs/1923.out b/tests/xfs/1923.out
> > new file mode 100644
> > index 0000000000..a0553cf3ee
> > --- /dev/null
> > +++ b/tests/xfs/1923.out
> > @@ -0,0 +1,8 @@
> > +QA output created by 1923
> > +Create source file
> > +Create Reflinked file
> > +Set cowextsize
> > +Fragment FS
> > +Allocate block sized extent from now onwards
> > +Create big delalloc extent in CoW fork
> > +Direct I/O write at offset 3FSB
> > 
> 
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux