Re: [PATCH 3/5] fstests: add fsstress + compaction test

On Wed, Jun 12, 2024 at 04:00:48PM +0800, Zorro Lang wrote:
> On Mon, Jun 10, 2024 at 08:02:00PM -0700, Luis Chamberlain wrote:
> > Running compaction while we run fsstress can crash older kernels as per
> > korg#218227 [0]. The fix for that has been posted and was merged in
> > v6.9-rc6 as commit d99e3140a4d3 ("mm: turn folio_test_hugetlb into a
> > PageType"). However, even on v6.10-rc2, where this kernel commit is
> > already merged, we can still deadlock when running fsstress while
> > triggering compaction at the same time. This is a new issue, reported
> > now through this patch, and the patch also serves as a reproducer with
> > high confidence: it always deadlocks.
> > If you enable CONFIG_PROVE_LOCKING with the defaults you will end up
> > with a complaint about increasing MAX_LOCKDEP_CHAIN_HLOCKS [1]. If
> > you adjust that, you then end up with a few soft lockup complaints and
> > some possible deadlock candidates to evaluate [2].
> > 
> > Provide a simple reproducer and pave the way so we keep on testing this.
> > 
> > Without lockdep enabled we silently deadlock on the first run of the
> > test without the fix applied. With lockdep enabled you get a splat about
> > the possible deadlock on the first run of the test.
> > 
> > [0] https://bugzilla.kernel.org/show_bug.cgi?id=218227
> > [1] https://gist.github.com/mcgrof/824913b645892214effeb1631df75072
> > [2] https://gist.github.com/mcgrof/926e183d21c5c4c55d74ec90197bd77a
> > 
> > Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > ---
> >  common/rc             |  7 +++++
> >  tests/generic/750     | 62 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/750.out |  2 ++
> >  3 files changed, 71 insertions(+)
> >  create mode 100755 tests/generic/750
> >  create mode 100644 tests/generic/750.out
> > 
> > diff --git a/common/rc b/common/rc
> > index e812a2f7cc67..18ad25662d5c 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -151,6 +151,13 @@ _require_hugepages()
> >  		_notrun "Kernel does not report huge page size"
> >  }
> >  
> > +# Requires CONFIG_COMPACTION
> > +_require_vm_compaction()
> > +{
> > +	if [ ! -f /proc/sys/vm/compact_memory ]; then
> > +	    _notrun "Need compaction enabled CONFIG_COMPACTION=y"
> > +	fi
> > +}
> >  # Get hugepagesize in bytes
> >  _get_hugepagesize()
> >  {
> > diff --git a/tests/generic/750 b/tests/generic/750
> > new file mode 100755
> > index 000000000000..334ab011dfa0
> > --- /dev/null
> > +++ b/tests/generic/750
> > @@ -0,0 +1,62 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2024 Luis Chamberlain.  All Rights Reserved.
> > +#
> > +# FS QA Test 750
> > +#
> > +# fsstress + memory compaction test
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto rw long_rw stress soak smoketest
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -f $runfile
> > +	rm -f $tmp.*
> > +	kill -9 $trigger_compaction_pid > /dev/null 2>&1
> > +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > +
> > +	wait > /dev/null 2>&1
> > +}
> > +
> > +# Import common functions.
> > +
> > +# real QA test starts here
> > +
> > +_supported_fs generic
> > +
> > +_require_scratch
> > +_require_vm_compaction
> > +_require_command "$KILLALL_PROG" "killall"
> > +
> > +# We still deadlock with this test on v6.10-rc2 and need more work,
> > +# but the commit below makes things better.
> > +_fixed_by_git_commit kernel d99e3140a4d3 \
> > +	"mm: turn folio_test_hugetlb into a PageType"
> > +
> > +echo "Silence is golden"
> > +
> > +_scratch_mkfs > $seqres.full 2>&1
> > +_scratch_mount >> $seqres.full 2>&1
> > +
> > +nr_cpus=$((LOAD_FACTOR * 4))
> > +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
> > +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
> > +
> > +# start a background trigger for memory compaction
> > +runfile="$tmp.compaction"
> > +touch $runfile
> > +while [ -e $runfile ]; do
> > +	echo 1 > /proc/sys/vm/compact_memory
> > +	sleep 5
> > +done &
> > +trigger_compaction_pid=$!
> > +
> > +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
> > +
> > +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full
> > +wait > /dev/null 2>&1
> 
> Won't this "wait" wait forever (barring a Ctrl+C), since no one removes
> the $runfile?

Odd, pretty sure I tested it and it didn't wait forever, but I'll add
the rm after the FSSTRESS call.
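
For reference, a minimal stand-alone sketch of that ordering, not the
actual v2 of the test. The write to /proc/sys/vm/compact_memory and the
$FSSTRESS_PROG invocation are replaced by harmless stand-ins (a no-op and
a sleep), the sleep intervals are shortened, and the names trigger_pid and
wait_status are illustrative only:

```shell
#!/bin/bash
# Sketch only: stand-ins replace the compaction trigger and fsstress,
# so this runs unprivileged and finishes in a couple of seconds.

runfile=$(mktemp)	# plays the role of $tmp.compaction in the test

# Background trigger: keeps looping for as long as the run file exists.
while [ -e "$runfile" ]; do
	:	# the real test does: echo 1 > /proc/sys/vm/compact_memory
	sleep 1	# the real test sleeps 5 seconds between triggers
done &
trigger_pid=$!

# Foreground workload; stand-in for the $FSSTRESS_PROG invocation.
sleep 1

# Remove the run file *before* waiting. Without this the loop above
# never terminates and a bare "wait" blocks on it indefinitely.
rm -f "$runfile"
wait "$trigger_pid"
wait_status=$?
echo "trigger loop exited with status $wait_status"
```

With the rm in place the loop sees the file gone on its next check, so
the wait returns after at most one more sleep interval.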

  Luis



