On Wed, Apr 29, 2020 at 01:21:43PM -0400, Brian Foster wrote:
> At unmount time, XFS emits an alert for every in-core buffer that
> might have undergone a write error. In practice this behavior is
> probably reasonable given that the filesystem is likely short lived
> once I/O errors begin to occur consistently. Under certain test or
> otherwise expected error conditions, this can spam the logs and slow
> down the unmount.
>
> Now that we have a ratelimit mechanism specifically for buffer
> alerts, reuse it for the per-buffer alerts in xfs_wait_buftarg().
> Also lift the final repair message out of the loop so it always
> prints and assert that the metadata error handling code has shut
> down the fs.
>
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_buf.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 594d5e1df6f8..8f0f605de579 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1657,7 +1657,8 @@ xfs_wait_buftarg(
>  	struct xfs_buftarg	*btp)
>  {
>  	LIST_HEAD(dispose);
> -	int loop = 0;
> +	int			loop = 0;
> +	bool			write_fail = false;
>
>  	/*
>  	 * First wait on the buftarg I/O count for all in-flight buffers to be
> @@ -1685,17 +1686,23 @@ xfs_wait_buftarg(
>  			bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
>  			list_del_init(&bp->b_lru);
>  			if (bp->b_flags & XBF_WRITE_FAIL) {
> -				xfs_alert(btp->bt_mount,
> +				write_fail = true;
> +				xfs_buf_alert_ratelimited(bp,
> +					"XFS: Corruption Alert",
>  "Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!",
>  					(long long)bp->b_bn);
> -				xfs_alert(btp->bt_mount,
> -"Please run xfs_repair to determine the extent of the problem.");
>  			}
>  			xfs_buf_rele(bp);
>  		}
>  		if (loop++ != 0)
>  			delay(100);
>  	}
> +
> +	if (write_fail) {
> +		ASSERT(XFS_FORCED_SHUTDOWN(btp->bt_mount));

I think this is incorrect. A metadata write that is set to retry
forever and is failing because of a bad sector or some other
persistent device error will not shut down the filesystem, but still
be reported here as a failure.

Hence we can easily get here without a filesystem shutdown having
occurred...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
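For illustration only, the retry-forever case described above can be modelled in a few lines of userspace C. This is a rough sketch and not kernel code: every name in it (model_mount, model_buf, MODEL_WRITE_FAIL, model_run and so on) is hypothetical, and it assumes a simplified rule in which only a finite retry limit escalates a failing write to a shutdown, with -1 standing in for "retry forever".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are real XFS symbols. */
#define MODEL_WRITE_FAIL	(1u << 0)

struct model_mount {
	bool	shutdown;	/* models a forced filesystem shutdown */
	int	max_retries;	/* assumption: -1 models "retry forever" */
};

struct model_buf {
	unsigned int	flags;
	int		retries;
};

/* Simulate one failed metadata write against a persistently bad device. */
static void model_write_error(struct model_mount *mp, struct model_buf *bp)
{
	bp->flags |= MODEL_WRITE_FAIL;
	bp->retries++;

	/* Assumption: only a finite retry limit escalates to a shutdown. */
	if (mp->max_retries >= 0 && bp->retries > mp->max_retries)
		mp->shutdown = true;
}

/* Returns true if the unmount-time assertion from the patch would hold. */
static bool model_unmount_assert_holds(const struct model_mount *mp,
				       const struct model_buf *bp)
{
	bool write_fail = (bp->flags & MODEL_WRITE_FAIL) != 0;

	return !write_fail || mp->shutdown;
}

/* Drive `attempts` failed writes, then check the unmount-time condition. */
static bool model_run(int max_retries, int attempts)
{
	struct model_mount mp = { .shutdown = false, .max_retries = max_retries };
	struct model_buf bp = { 0 };

	for (int i = 0; i < attempts; i++)
		model_write_error(&mp, &bp);

	return model_unmount_assert_holds(&mp, &bp);
}
```

In this model, model_run(-1, 10) returns false: the buffer carries the write-failure flag but the mount was never shut down, which is the state the ASSERT would trip on. model_run(3, 10) returns true because the finite limit is exceeded and the model shuts down first.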