Re: [PATCH] xfs: up(ic_sema) if flushing data device fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 23, 2023 at 02:22:21PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 23, 2023 at 11:14:10AM -0700, Leah Rumancik wrote:
> > We flush the data device cache before we issue external log IO. Since
> > 7d839e325af2, we check the return value of the flush, and if the flush
> > failed, we shut down the log immediately and return. However, the
> > iclog->ic_sema is left in a decremented state so let's add an up().
> > Prior to this patch, xfs/438 would fail consistently when running with
> > an external log device:
> > 
> > sync
> >   -> xfs_log_force
> >   -> xlog_write_iclog
> >       -> down(&iclog->ic_sema)
> >       -> blkdev_issue_flush (fail causes us to intiate shutdown)
> >           -> xlog_force_shutdown
> >           -> return
> > 
> > unmount
> >   -> xfs_log_umount
> >       -> xlog_wait_iclog_completion
> >           -> down(&iclog->ic_sema) --------> HANG
> > 
> > There is a second early return / shutdown. Add an up() there as well.
> 
> Ow.  Yes, I think it's correct that both of those error returns need to
> drop ic_sema since we don't submit_bio, so there is no xlog_ioend_work
> to do it for us.
> 
> > Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> 
> Hmm.  This bug was introduced in b5d721eaae47e ("xfs: external logs need
> to flush data device"), not 7d839.  That said, this patch only applies
> cleanly to 7d839e325af2.
> 
> b5d721 was introduced in 5.14 and 7d839 came in via 6.0, so ... this is
> where I would have spat out:
> 
> Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> Actually-Fixes: b5d721eaae47e ("xfs: external logs need to flush data device")

I'm not sure I follow. Before 7d839e325af2, we didn't return if there
was an issue with the flush so there wasn't an issue with ic_sema. 7d839e325af2
was a fix for eef983ffeae7 though. Do you try to keep fixes tags
associated with the original commit as opposed to fixes of fixes?

> 
> > Signed-off-by: Leah Rumancik <leah.rumancik@xxxxxxxxx>
> > ---
> > 
> > Notes:
> >     Tested auto group for xfs/4k and xfs/logdev configs with no regressions
> >     seen.
> > 
> >  fs/xfs/xfs_log.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 51c100c86177..b4a8105299c2 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1926,6 +1926,7 @@ xlog_write_iclog(
> >  		 */
> >  		if (log->l_targ != log->l_mp->m_ddev_targp &&
> >  		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
> > +			up(&iclog->ic_sema);
> >  			xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> >  			return;
> >  		}
> > @@ -1936,6 +1937,7 @@ xlog_write_iclog(
> >  	iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
> >  
> >  	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
> > +		up(&iclog->ic_sema);
> >  		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> 
> I wonder if these two should both become a cleanup clause at the end?

Sure, that sounds good, I'll create a new version.

Thanks for the review! :)
- Leah
> 
> 		if (log->l_targ != log->l_mp->m_ddev_targp &&
> 		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev))
> 			goto shutdown;
> 
> ...
> 
> 	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count))
> 		goto shutdown;
> 
> ...
> 
> 	submit_bio(&iclog->ic_bio);
> 	return;
> 
> shutdown:
> 	up(&iclog->ic_sema);
> 	xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
> 
> Seeing as we've screwed this up twice now and not a whole lot of people
> actually use external logs, and somehow I've never seen this on my test
> fleet.
> 
> Anyway the code change looks correct so modulo the stylistic thing,
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> --D
> 
> >  		return;
> >  	}
> > -- 
> > 2.42.0.758.gaed0368e0e-goog
> > 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux