On Wed, Apr 29, 2020 at 01:21:41PM -0400, Brian Foster wrote: > The buffer write failure flag is intended to control the internal > write retry that XFS has historically implemented to help mitigate > the severity of transient I/O errors. The flag is set when a buffer > is resubmitted from the I/O completion path due to a previous > failure. It is checked on subsequent I/O completions to skip the > internal retry and fall through to the higher level configurable > error handling mechanism. The flag is cleared in the synchronous and > delwri submission paths and also checked in various places to log > write failure messages. > > There are a couple minor problems with the current usage of this > flag. One is that we issue an internal retry after every submission > from xfsaild due to how delwri submission clears the flag. This > results in double the expected or configured number of write > attempts when under sustained failures. Another more subtle issue is > that the flag is never cleared on successful I/O completion. This > can cause xfs_wait_buftarg() to suggest that dirty buffers are being > thrown away due to the existence of the flag, when the reality is > that the flag might still be set because the write succeeded on the > retry. > > Clear the write failure flag on successful I/O completion to address > both of these problems. This means that the internal retry attempt > occurs once since the last time a buffer write failed and that > various other contexts only see the flag set when the immediately > previous write attempt has failed. > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> Makes sense, and probably explains why the ioerr retry timeouts sometimes took longer than I was expecting them to... Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> --D > --- > fs/xfs/xfs_buf.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index d5d6a68bb1e6..fd76a84cefdd 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -1197,8 +1197,10 @@ xfs_buf_ioend( > bp->b_ops->verify_read(bp); > } > > - if (!bp->b_error) > + if (!bp->b_error) { > + bp->b_flags &= ~XBF_WRITE_FAIL; > bp->b_flags |= XBF_DONE; > + } > > if (bp->b_iodone) > (*(bp->b_iodone))(bp); > @@ -1274,7 +1276,7 @@ xfs_bwrite( > > bp->b_flags |= XBF_WRITE; > bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q | > - XBF_WRITE_FAIL | XBF_DONE); > + XBF_DONE); > > error = xfs_buf_submit(bp); > if (error) > @@ -1996,7 +1998,7 @@ xfs_buf_delwri_submit_buffers( > * synchronously. Otherwise, drop the buffer from the delwri > * queue and submit async. > */ > - bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL); > + bp->b_flags &= ~_XBF_DELWRI_Q; > bp->b_flags |= XBF_WRITE; > if (wait_list) { > bp->b_flags &= ~XBF_ASYNC; > -- > 2.21.1 >