Re: [PATCH 2/5] xfs: external logs need to flush data device

On Thu, Jul 22, 2021 at 11:14:45AM -0700, Darrick J. Wong wrote:
> On Thu, Jul 22, 2021 at 11:53:32AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > The recent journal flush/FUA changes replaced the flushing of the
> > data device on every iclog write with an up-front async data device
> > cache flush. Unfortunately, the assumption on which this was based
> > has been proven incorrect by the flush vs log tail update
> > ordering issue. As the fix for that issue uses the
> > XLOG_ICL_NEED_FLUSH flag to indicate that the data device needs a cache
> > flush, we now need to (once again) ensure that an iclog write to
> > an external log that needs a cache flush actually issues a
> > cache flush to the data device as well as the log device.
> > 
> > Fixes: eef983ffeae7 ("xfs: journal IO cache flush reductions")
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_log.c | 19 +++++++++++--------
> >  1 file changed, 11 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 96434cc4df6e..a3c4d48195d9 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -827,13 +827,6 @@ xlog_write_unmount_record(
> >  	/* account for space used by record data */
> >  	ticket->t_curr_res -= sizeof(ulf);
> >  
> > -	/*
> > -	 * For external log devices, we need to flush the data device cache
> > -	 * first to ensure all metadata writeback is on stable storage before we
> > -	 * stamp the tail LSN into the unmount record.
> > -	 */
> > -	if (log->l_targ != log->l_mp->m_ddev_targp)
> > -		blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev);
> >  	return xlog_write(log, &vec, ticket, NULL, NULL, XLOG_UNMOUNT_TRANS);
> >  }
> >  
> > @@ -1796,10 +1789,20 @@ xlog_write_iclog(
> >  	 * metadata writeback and causing priority inversions.
> >  	 */
> >  	iclog->ic_bio.bi_opf = REQ_OP_WRITE | REQ_META | REQ_SYNC | REQ_IDLE;
> > -	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH)
> > +	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH) {
> >  		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
> > +		/*
> > +		 * For external log devices, we also need to flush the data
> > +		 * device cache first to ensure all metadata writeback covered
> > +		 * by the LSN in this iclog is on stable storage. This is slow,
> > +		 * but it *must* complete before we issue the external log IO.
> 
> I'm a little confused about what's going on here.  We're about to write
> a log record to disk, with h_tail_lsn reflecting the tail of the log and
> h_lsn reflecting the current head of the log (i.e. this record).
> 
> If the log tail has moved forward since the last log record was written
> and this fs has an external log, we need to flush the data device
> because the AIL could have written logged items back into the filesystem
> and we need to ensure those items have been persisted before we write to
> the log the fact that the tail moved forward.  The AIL itself doesn't
> issue cache flushes (nor does it need to), so that's why we do that
> here.
> 
> Why don't we need a flush like this if only FUA is set?  Is it not
> possible to write a checkpoint that fits within a single iclog after the
> log tail has moved forward?

Yes, it is, and that race condition is exactly what the next
patch in the series addresses. The data device cache flush is issued
before we start writing the checkpoint to the iclogs. If the log tail
moves after that flush was issued, we detect that when releasing the
commit iclog and set the XLOG_ICL_NEED_FLUSH flag on it. That will
then trigger this code to issue a data device cache flush....
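
To illustrate the detection step described above, here is a rough userspace
model, not the code from the next patch. The struct, the function and the
MODEL_ICL_NEED_FLUSH name are hypothetical stand-ins, and comparing two
sampled tail LSNs is only an assumption about how "the tail moved" could be
noticed:

```c
#include <stdint.h>

/* Hypothetical stand-in for XLOG_ICL_NEED_FLUSH in this model. */
#define MODEL_ICL_NEED_FLUSH (1u << 0)

/* Hypothetical, heavily reduced stand-in for an in-core log buffer. */
struct model_iclog {
	unsigned int flags;
};

/*
 * Model of releasing the commit iclog. tail_at_flush is the log tail
 * sampled when the up-front data device cache flush was issued, and
 * tail_now is the tail about to be stamped into the record. If they
 * differ, the earlier flush may not cover the new tail, so the iclog
 * write must carry its own flush.
 */
static void
model_release_commit_iclog(
	struct model_iclog	*iclog,
	uint64_t		tail_at_flush,
	uint64_t		tail_now)
{
	if (tail_now != tail_at_flush)
		iclog->flags |= MODEL_ICL_NEED_FLUSH;
}
```

In this model an unchanged tail leaves the flags alone, so the common case
still relies on the single up-front flush.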

IOWs, for external logs, the XLOG_ICL_NEED_FLUSH flag indicates that
both the data device and the log device need a cache flush, rather
than just the log device. I think it could be split into two flags,
but then my head explodes thinking about log forces and trying to
determine what type of flush is implied (and what flags we'd need to
set) when we return log_flushed = true....
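
A small userspace sketch of what the single flag implies at iclog write
time, under the semantics described above. The struct, the function and both
MODEL_* flag names are hypothetical; the FUA flag is included only to show
the "only FUA is set" case from the question, and none of this is the real
xlog_write_iclog():

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the iclog flags discussed in this thread. */
#define MODEL_NEED_FLUSH (1u << 0)
#define MODEL_NEED_FUA   (1u << 1)

/* What the model decides to do for one iclog write. */
struct model_flush_plan {
	bool flush_data_dev;	/* synchronous data device cache flush first */
	bool preflush_log_dev;	/* preflush on the log device write */
	bool fua_log_write;	/* FUA on the log device write */
};

static struct model_flush_plan
model_plan_iclog_write(
	unsigned int	ic_flags,
	bool		external_log)
{
	struct model_flush_plan plan = { false, false, false };

	if (ic_flags & MODEL_NEED_FLUSH) {
		plan.preflush_log_dev = true;
		/*
		 * With an external log, a preflush on the log device does
		 * not touch the data device cache, so the data device needs
		 * its own flush before the log IO is issued.
		 */
		if (external_log)
			plan.flush_data_dev = true;
	}
	if (ic_flags & MODEL_NEED_FUA)
		plan.fua_log_write = true;
	return plan;
}
```

With an internal log the data device and the log device are the same
device, so the preflush alone covers both in this model. An iclog with only
the FUA flag set gets no data device flush here, which is why the race above
has to be caught by setting the flush flag.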

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


