Re: [RFC] fstests: add fsstress + writeback + debugfs folio split test

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 24, 2024 at 03:46:48PM -0700, Luis Chamberlain wrote:
> Stress test folio splits by using the debugfs interface to a target
> a new smaller folio order. This is dangerous at the moment as its using
> a debugfs API which requires two out of tree fixes [0] [1] which will
> be posted shortly. With these patches applied no crash is possible yet.
> However, this test was designed to try to exacerbate races with folio
> splits and writeback, at least running generic/447 twice ends up
> producing a crash only if large folio splits with minimum folio order
> is enabled.
> 
> With the known fixes for the debugfs interface, this test produces no
> crashes even after 3 hour soaking for 4k and LBS. We should enhance
> this test a bit more so to reproduce the issues observed by running
> generic/447 twice.
> 
> This also begs the question if something like MADV_NOHUGEPAGE might be
> desirable from userspace, so to enable userspace to request splits when
> possible.
> 
> If inspecting more closely, you'll want to enable on your kernel boot:
> 
> 	dyndbg='file mm/huge_memory.c +p'
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20240424-lbs&id=80f6df5037fd0ad560526af45bd7f4d779fe03f6
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20240424-lbs&id=38f6fac5b4283ea48b1876fc56728f062168f8c3
> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> ---
> 
> For those that want to help stress test folio splits to an order, hopefully
> this will help start to enable this. Perhaps there are better ways to create
> more easy targets to stress test folio splits, and in particular try to
> reproduce the issue which so far is only possible by running generic/447 twice
> on LBS. The issue with generic/447 on LBS is not observed on 4k, and this test
> produces no crashes on LBS...
> 
>  common/rc             | 20 +++++++++
>  tests/generic/745     | 97 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/745.out |  2 +
>  3 files changed, 119 insertions(+)
>  create mode 100755 tests/generic/745
>  create mode 100644 tests/generic/745.out
> 
> diff --git a/common/rc b/common/rc
> index d4432f5ce259..1eefb53aa84b 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -127,6 +127,26 @@ _require_compaction()
>  	    _notrun "Need compaction enabled CONFIG_COMPACTION=y"
>  	fi
>  }
> +
> +# Requires CONFIG_DEBUGFS and truncation knobs
> +SPLIT_DEBUGFS="/sys/kernel/debug/split_huge_pages"
> +_require_split_debugfs()
> +{
> +       if [ ! -f $SPLIT_DEBUGFS ]; then
> +           _notrun "Needs CONFIG_DEBUGFS and split_huge_pages"
> +       fi
> +}

The global SPLIT_DEBUGFS isn't necessary, you can just use:

  $DEBUGFS_MNT/split_huge_pages

And this remind me we have a _require_debugfs helper in common/rc, but
it's not used by any one test case. And it looks like not correct:

  _require_debugfs()
  {
      #boot_params always present in debugfs
      [ -d "$DEBUGFS_MNT/boot_params" ] || _notrun "Debugfs not mounted"
  }

I can't find "boot_params" in /sys/kernel/debug/. So we might need to
fix this helper, and then call it in debugfs related test cases.

I'll send another patch to talk about that fix. For this case, it needs
_require_debugfs and $DEBUGFS_MNT.

> +
> +_split_huge_pages_file_full()

May we have a comment to explain what this common helper for? Thanks.

> +{
> +	local file=$1
> +	local offset="0x0"
> +	local len=$(printf "%x" $(stat --format='%s' $file))
> +	local order="0"
> +	local split_cmd="$file,$offset,0x${len},$order"
> +	echo $split_cmd > $SPLIT_DEBUGFS
                           ^^^^^^^^^^^^^
$DEBUGFS_MNT/split_huge_pages is good enough.

> +}
> +
>  # Get hugepagesize in bytes
>  _get_hugepagesize()
>  {
> diff --git a/tests/generic/745 b/tests/generic/745
> new file mode 100755
> index 000000000000..0a30dbee35bd
> --- /dev/null
> +++ b/tests/generic/745
> @@ -0,0 +1,97 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
> +#
> +# FS QA Test No. 734
> +#
> +# Run fsstress in a loop, and in the background force some writeback and
> +# folio splits for every file. If you're enabling this and want to check
> +# underneath the hood you may want to enable:
> +#
> +# dyndbg='file mm/huge_memory.c +p'
> +. ./common/preamble
> +_begin_fstest long_rw stress soak smoketest dangerous_fuzzers

Just double check, is it a dangerous_fuzzers test, and not good to be in auto group?

> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +}
> +
> +# Import common functions.
> +. ./common/filter

I didn't see any filters called in this case, so don't need to
import it.

> +
> +# real QA test starts here
> +_supported_fs generic
> +_require_test

I didn't see TEST_DIR or TEST_DEV in this test case, so don't need
_require_test

> +_require_scratch
> +_require_split_debugfs
> +_require_command "$KILLALL_PROG" "killall"
> +
> +echo "Silence is golden"
> +
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
> +
> +fsstress_args=(-w -d $SCRATCH_MNT/test -n $nr_ops -p $nr_cpus)
> +
> +# used to let our loops know when to stop
> +runfile="$tmp.keep.running.loop"
> +touch $runfile
> +
> +# The background ops are out of bounds, the goal is to race with fsstress.
> +
> +# Force folio split if possible, this seems to be screaming for MADV_NOHUGEPAGE
> +# for large folios.
> +while [ -e $runfile ]; do
> +	for i in $(find $SCRATCH_MNT/test \( -type f \) 2>/dev/null); do

May I ask why the "\( .. \)" is needed?

> +		_split_huge_pages_file_full $i >/dev/null 2>&1

Just make sure, don't you want to output to $seqres.full file to get more information
for debug, if something wrong.

> +	done
> +	sleep 2
> +done &
> +split_huge_pages_files_pid=$!

This split_huge_pages_files_pid isn't be used anywhere, you can deal with it in
_cleanup.

> +
> +blocksize=$(_get_file_block_size $SCRATCH_MNT)
> +export XFS_DIO_MIN=$((blocksize * 2))

Oh, I even forgot we have this parameter in fsstress.c :)

> +
> +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
> +
> +# Our general goal here is to race with ops which can stress folio addition,
> +# removal, edits, and writeback.
> +
> +# zero frequencies for write ops to minimize writeback
> +fsstress_args+=(-R)

But you set "fsstress_args=(-w -d $SCRATCH_MNT/test -n $nr_ops -p $nr_cpus)" above,
so you zeros frequencies of non-write operations (-w) and then zeros frequencies of
write operations (-R). Do you want to use "-z" directly, to zeros frequencies of
all operations ?

> +
> +# XXX: we can improve this, so to increase the chances to allow more
> +# folio splits. Also running generic/447 twice triggers a corner case we can't
> +# capture here on folio splits and write_cache_pages, increasing the chances of
> +# this test to cover that same corner case would be ideal.
> +#
> +# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
> +fsstress_args+=(-f creat=1)
> +fsstress_args+=(-f write=1)
> +fsstress_args+=(-f dwrite=1)
> +fsstress_args+=(-f truncate=1)
> +fsstress_args+=(-f zero=1)
> +fsstress_args+=(-f unlink=1)
> +fsstress_args+=(-f fsync=1)
> +fsstress_args+=(-f punch=2)
> +fsstress_args+=(-f copyrange=4)
> +fsstress_args+=(-f clonerange=4)
> +
> +if [[ "$FSTYP" != "xfs" || "$FSTYP" == "ext4" ]]; then
> +	fsstress_args+=(-f collapse=1)

Can you explain why the FALLOC_FL_COLLAPSE_RANGE is special, and not suitable
for xfs?

And I think the range of $FSTYP" != "xfs" contains "$FSTYP" == "ext4", so
this logic makes me confused.

> +fi
> +
> +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full
> +
> +rm -f $runfile
> +wait > /dev/null 2>&1

Better to move this to _cleanup. e.g.

rm -f $runfile
[ -n "$split_huge_pages_files_pid" ] && kill $split_huge_pages_files_pid
$KILLALL_PROG -9 fsstress
wait


Thanks,
Zorro
> +
> +status=0
> +exit
> diff --git a/tests/generic/745.out b/tests/generic/745.out
> new file mode 100644
> index 000000000000..fce6b7f5489d
> --- /dev/null
> +++ b/tests/generic/745.out
> @@ -0,0 +1,2 @@
> +QA output created by 745
> +Silence is golden
> -- 
> 2.43.0
> 
> 





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux