Re: [PATCH] tests/generic: test writepage cached mapping validity

On Thu, Oct 26, 2017 at 11:34:02PM +0800, Eryu Guan wrote:
> On Thu, Oct 26, 2017 at 10:48:16AM -0400, Brian Foster wrote:
> > XFS has a bug where page writeback can end up sending data to the
> > wrong location due to a stale, cached file mapping. Add a test to
> > trigger this problem by racing background writeback with a
> > truncate/rewrite of the final page of the file.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> 
> Thanks a lot for the new test!
> 
> > ---
> > 
> > Here's a new version of the writepages test I previously posted as RFC.
> > This variant does not require an artificial delay to reproduce, so I've
> > dropped the need for the error injection tag.
> > 
> > I have been playing a bit with the file size and iteration count of the
> > test. I started with something that ran a decent bit longer (~2m) as was
> > necessary to reproduce on my dev/debug vm, but recently trimmed the file
> > size and iteration count to something that runs much quicker (~10s) and
> > reproduces nearly 100% of the time on my actual test hardware. The
> > tradeoff is the reproducibility is much lower on my debug vm (~20-25%
> > perhaps). The test still does reproduce when run over 10-15 iters, so I
> > opted for the quicker test.
> > 
> > In all, I am a bit curious about whether this reproduces reliably on
> > others' test setups. If not, does tweaking the size/iterations improve
> > the reproducibility?
> 
> On my test vm, with the default size/iteration numbers, the
> reproducibility is around 40%, run time is 3s. Then I doubled the
> iteration number, and it's 100% reproduced, run time is 7s.
> 
> On my real hardware, I have to double both file size and iteration
> numbers to reproduce, reproducibility is ~20%, run time 35s.
> 
> Note that the vm is running v4.14-rc5 based 'xfs-4.14-fixes-7' tag from
> Darrick's tree and the real hardware is running v4.14-rc6.
> 

Thanks for testing this... It's interesting that you don't seem to
reproduce at all on the real hardware with the current values. What do
you have for storage on both of these setups? My VM is a slow, single
spindle while the hardware is also spinning rust but on a hardware raid.

If I run with 64MB, 32 iters, I'm at ~48 seconds on the VM. I can check
on bare metal as soon as the test run I have currently running
completes.
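
For anyone who wants to try other values, here is a rough sketch of how the
sizes in the test derive from the two knobs discussed above. The names
filesize_mb and iters are hypothetical (the posted test hard-codes 32MB and
"seq 0 15"), and getconf is only an assumed stand-in for "src/feature -s" so
the snippet runs outside of an fstests checkout:

```shell
#!/bin/bash
# Hypothetical tunables; the posted patch hard-codes 32MB and
# "seq 0 15" (16 passes). These are the doubled values.
filesize_mb=64
iters=32

# Assumption: getconf replaces "src/feature -s" outside of fstests.
pagesize=$(getconf PAGE_SIZE)
filesize=$((1024 * 1024 * filesize_mb))

# The racing truncate cuts off exactly the final page, which is then
# rewritten by the pwrite that follows it.
truncsize=$((filesize - pagesize))

echo "filesize=$filesize truncsize=$truncsize pagesize=$pagesize"
echo "loop: for i in \$(seq 0 $((iters - 1)))"
```

This only prints the derived values; it does not touch a filesystem.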

Brian

> Thanks,
> Eryu
> 
> > 
> > Brian
> > 
> > v1:
> > - New test algorithm that does not require artificial delay.
> > - Created as generic test.
> > rfc: https://marc.info/?l=linux-xfs&m=150886719725497&w=2
> > 
> >  tests/generic/999     | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/999.out |  2 ++
> >  tests/generic/group   |  1 +
> >  3 files changed, 97 insertions(+)
> >  create mode 100755 tests/generic/999
> >  create mode 100644 tests/generic/999.out
> > 
> > diff --git a/tests/generic/999 b/tests/generic/999
> > new file mode 100755
> > index 0000000..9e56a1e
> > --- /dev/null
> > +++ b/tests/generic/999
> > @@ -0,0 +1,94 @@
> > +#! /bin/bash
> > +# FS QA Test 999
> > +#
> > +# Test XFS page writeback code for races with the cached file mapping. XFS
> > +# caches the file -> block mapping for a full extent once it is initially looked
> > +# up. The cached mapping is used for all subsequent pages in the same writeback
> > +# cycle that cover the associated extent. Under certain conditions, it is
> > +# possible for concurrent operations on the file to invalidate the cached
> > +# mapping without the knowledge of writeback. Writeback ends up sending I/O to a
> > +# partly stale mapping and potentially leaving delalloc blocks in the current
> > +# mapping unconverted.
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> > +#-----------------------------------------------------------------------
> > +#
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1	# failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +
> > +# remove previous $seqres.full before test
> > +rm -f $seqres.full
> > +
> > +# real QA test starts here
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_scratch
> > +_require_test_program "feature"
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
> > +_scratch_mount || _fail "mount failed"
> > +
> > +file=$SCRATCH_MNT/file
> > +filesize=$((1024 * 1024 * 32))
> > +pagesize=`src/feature -s`
> > +truncsize=$((filesize - pagesize))
> > +
> > +for i in $(seq 0 15); do
> > +	# Truncate the file and fsync to persist the final size on-disk. This is
> > +	# required so the subsequent truncate will not wait on writeback.
> > +	$XFS_IO_PROG -fc "truncate 0" $file
> > +	$XFS_IO_PROG -c "truncate $filesize" -c fsync $file
> > +
> > +	# create a small enough delalloc extent to likely be contiguous
> > +	$XFS_IO_PROG -c "pwrite 0 $filesize" $file >> $seqres.full 2>&1
> > +
> > +	# Start writeback and a racing truncate and rewrite of the final page.
> > +	$XFS_IO_PROG -c "sync_range -w 0 0" $file &
> > +	sync_pid=$!
> > +	$XFS_IO_PROG -c "truncate $truncsize" \
> > +		     -c "pwrite $truncsize $pagesize" $file >> $seqres.full 2>&1
> > +
> > +	# If the test fails, the most likely outcome is an sb_fdblocks mismatch
> > +	# and/or an associated delalloc assert failure on inode reclaim. Cycle
> > +	# the mount to trigger detection.
> > +	wait $sync_pid
> > +	_scratch_cycle_mount || _fail "mount failed"
> > +done
> > +
> > +echo Silence is golden
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/999.out b/tests/generic/999.out
> > new file mode 100644
> > index 0000000..3b276ca
> > --- /dev/null
> > +++ b/tests/generic/999.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 999
> > +Silence is golden
> > diff --git a/tests/generic/group b/tests/generic/group
> > index fbe0a7f..89342da 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -468,3 +468,4 @@
> >  463 auto quick clone dangerous
> >  464 auto rw
> >  465 auto rw quick aio
> > +999 auto quick
> > -- 
> > 2.9.5
> > 