On Thu, Oct 27, 2022 at 12:57:47AM +0800, Zorro Lang wrote:
> There was a known xfs crash bug fixed by e001873853d8 ("xfs: ensure
> we capture IO errors correctly"), so trys to cover this bug and make
> sure xfs can capture IO errors correctly, won't panic and hang again.
>
> Signed-off-by: Zorro Lang <zlang@xxxxxxxxxx>
> ---
>
> Hi,
>
> When I tried to tidy up our internal test cases recently, I found a very
> old case which trys to cover e001873853d8 ("xfs: ensure we capture IO errors
> correctly") which fix by Dave. At that time, we didn't support xfs injection,
> so we tested it by a systemtap script [1] to inject an ioerror.
>
> Now this bug has been fixed long long time ago (9+ years), and that stap script
> is already out of date, can't work with new kernel. But good news is we have xfs
> injection now, so I try to resume this test case in fstests.
>
> I didn't verify if this case can reproduce that bug on old rhel (which doesn't
> support error injection). The original case tried to inject errno 11, I'm
> not sure if it's worth trying more other errors. I searched "buf_ioerror" in
> fstests, found nothing. So maybe this bug is old enough, but it's worth covering
> this kind of test. So feel free to tell me if you have any suggestions :)
>
> Thanks,
> Zorro
>
> [1]
> probe module("xfs").function("xfs_buf_bio_end_io")
> {
>         if ($error == 0) {
>                 if ($bio->bi_rw & (1 << 4)) {
>                         $error = -11;
>                         printf("%s: comm %s, pid %d, setting error 11\n",
>                                 probefunc(), execname(), pid());
>                         print_stack(backtrace());
>                 }
>         }
> }
>
>  tests/xfs/554     | 53 +++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/554.out |  4 ++++
>  2 files changed, 57 insertions(+)
>  create mode 100755 tests/xfs/554
>  create mode 100644 tests/xfs/554.out
>
> diff --git a/tests/xfs/554 b/tests/xfs/554
> new file mode 100755
> index 00000000..6935bfc0
> --- /dev/null
> +++ b/tests/xfs/554
> @@ -0,0 +1,53 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2022 YOUR NAME HERE.  All Rights Reserved.

Mr. YOUR HERE,
Please write your real name in the copyright statement.

> +#
> +# FS QA Test 554
> +#
> +# There was a known xfs crash bug fixed by e001873853d8 ("xfs: ensure we
> +# capture IO errors correctly"), so trys to cover this bug and make sure
> +# xfs can capture IO errors correctly, won't panic and hang again.
> +#
> +. ./common/preamble
> +_begin_fstest auto eio
> +
> +_cleanup()
> +{
> +	$KILLALL_PROG -q fsstress 2> /dev/null
> +	# ensures all fsstress processes died
> +	wait
> +	# log replay, due to the buf_ioerror injection might leave dirty log
> +	_scratch_cycle_mount
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +# Import common functions.
> +. ./common/inject
> +
> +# real QA test starts here
> +_supported_fs xfs
> +_require_command "$KILLALL_PROG" "killall"
> +_require_scratch
> +_require_xfs_debug
> +_require_xfs_io_error_injection "buf_ioerror"
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount
> +
> +echo "Inject buf ioerror tag"
> +_scratch_inject_error buf_ioerror 11
> +
> +echo "Random I/Os testing ..."
> +$FSSTRESS_PROG $FSSTRESS_AVOID -d $SCRATCH_MNT -n 50000 -p 100 >> $seqres.full &
> +for ((i=0; i<5; i++));do
> +	# Clear caches, then trys to use 'find' to trigger readahead

BUF_IOERROR only seems to apply to async writes:

static void
xfs_buf_bio_end_io(
	struct bio		*bio)
{
	struct xfs_buf		*bp = (struct xfs_buf *)bio->bi_private;

	if (!bio->bi_status &&
	    (bp->b_flags & XBF_WRITE) && (bp->b_flags & XBF_ASYNC) &&
	    XFS_TEST_ERROR(false, bp->b_mount, XFS_ERRTAG_BUF_IOERROR))
		bio->bi_status = BLK_STS_IOERR;

So I don't see how this would reproduce the problem of b_error not being
cleared after a failed readahead and re-read?

--D

> +	echo 3 > /proc/sys/vm/drop_caches
> +	find $SCRATCH_MNT >/dev/null 2>&1
> +	sleep 3
> +done
> +
> +echo "No hang or panic"
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/554.out b/tests/xfs/554.out
> new file mode 100644
> index 00000000..26910daa
> --- /dev/null
> +++ b/tests/xfs/554.out
> @@ -0,0 +1,4 @@
> +QA output created by 554
> +Inject buf ioerror tag
> +Random I/Os testing ...
> +No hang or panic
> --
> 2.31.1
>
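
To make the point about the injection condition concrete, here is a minimal userspace sketch of the check in xfs_buf_bio_end_io() quoted above. It is only a model, not kernel code: the MODEL_XBF_* names and values and the model_would_inject() helper are made up for illustration, and the errtag_fires argument stands in for whatever XFS_TEST_ERROR() decides. It also assumes that a readahead buffer carries the read and async flags but never the write flag.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel's XBF_* buffer flags. The names
 * and bit values are assumptions for this sketch only; the real
 * definitions live in the XFS buffer cache headers.
 */
enum {
	MODEL_XBF_READ       = 1u << 0,
	MODEL_XBF_WRITE      = 1u << 1,
	MODEL_XBF_READ_AHEAD = 1u << 2,
	MODEL_XBF_ASYNC      = 1u << 3,
};

/*
 * Model of the quoted condition: an error is injected only when the bio
 * completed without an error, the buffer is both a write and async, and
 * the error tag fires (errtag_fires models XFS_TEST_ERROR()).
 */
bool model_would_inject(int bio_status, unsigned int b_flags,
			bool errtag_fires)
{
	return bio_status == 0 &&
	       (b_flags & MODEL_XBF_WRITE) != 0 &&
	       (b_flags & MODEL_XBF_ASYNC) != 0 &&
	       errtag_fires;
}
```

Under those assumptions, a call such as model_would_inject(0, MODEL_XBF_READ | MODEL_XBF_READ_AHEAD | MODEL_XBF_ASYNC, true) returns false even when the tag fires, which is why the find loop in the test would not be expected to see an injected readahead failure.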