Re: [PATCH 2/8] xfs/155: fail the test if xfs_repair hangs for too long




On Mon, Feb 26, 2024 at 06:01:03PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> There are a few hard to reproduce bugs in xfs_repair where it can
> deadlock trying to lock a buffer that it already owns.  These stalls
> cause fstests never to finish, which is annoying!  To fix this, set up
> the xfs_repair run to abort after 10 minutes, which will affect the
> golden output and capture a core file.
> 
> This doesn't fix xfs_repair, obviously.
> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  tests/xfs/155 |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> 
> diff --git a/tests/xfs/155 b/tests/xfs/155
> index 302607b510..fba557bff6 100755
> --- a/tests/xfs/155
> +++ b/tests/xfs/155
> @@ -27,6 +27,10 @@ _require_scratch_xfs_crc		# needsrepair only exists for v5
>  _require_populate_commands
>  _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
>  
> +# Inject a 10 minute abortive timeout on the repair program so that deadlocks
> +# in the program do not cause fstests to hang indefinitely.
> +XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"

Other fstests cases always do:
  _require_command "$TIMEOUT_PROG" timeout
before using timeout.
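
As a rough illustration of that pattern, here is a minimal sketch that can run outside fstests. It assumes GNU coreutils timeout; the command -v lookup and the skip message stand in for what common/config and _require_command do inside a real test, and REPAIR_STANDIN and WRAPPED_PROG are hypothetical names, with "sleep 5" playing the part of a hung xfs_repair and a 1 second limit replacing the 10m from the patch.

```shell
# Sketch only. Inside a real test, TIMEOUT_PROG comes from common/config
# and the availability check would be:
#   _require_command "$TIMEOUT_PROG" timeout
TIMEOUT_PROG="${TIMEOUT_PROG:-$(command -v timeout)}"
if [ -z "$TIMEOUT_PROG" ]; then
	echo "timeout utility not found, skipping"
	exit 0
fi

# Hypothetical stand-in for a deadlocked xfs_repair.
REPAIR_STANDIN="sleep 5"

# Same wrapping as in the patch, but with a 1 second limit instead of 10m.
WRAPPED_PROG="$TIMEOUT_PROG -s ABRT 1 $REPAIR_STANDIN"

# The stand-in is killed with SIGABRT once the limit expires.
$WRAPPED_PROG
status=$?

# GNU timeout exits with 124 when the command timed out.
echo "exit status: $status"
```

With the check in place, a system lacking timeout gets a skipped test rather than a failure from a missing command.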

The rest looks good to me. Since you only change a single test case, it won't affect other testing.
I just hope 10 minutes is enough, even on big storage :)

Thanks,
Zorro

> +
>  # Populate the filesystem
>  _scratch_populate_cached nofill >> $seqres.full 2>&1
>  
> 




