Re: [PATCH] xfs: ensure submit buffers on LSN boundaries in error handlers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 11, 2024 at 10:08:51AM +1100, Dave Chinner wrote:
> On Wed, Jan 10, 2024 at 03:03:24PM +0800, Long Li wrote:
> > On Wed, Jan 10, 2024 at 08:43:31AM +1100, Dave Chinner wrote:
> > > The issue is that the code as it stands doesn't handle object
> > > recovery from multiple checkpoints with the same start lsn. The
> > > easiest way to understand this is to look at the buffer submit logic
> > > on completion of a checkpoint:
> > > 
> > > 	if (log->l_recovery_lsn != trans->r_lsn &&
> > >             ohead->oh_flags & XLOG_COMMIT_TRANS) {
> > >                 error = xfs_buf_delwri_submit(buffer_list);
> > >                 if (error)
> > >                         return error;
> > >                 log->l_recovery_lsn = trans->r_lsn;
> > >         }
> > > 
> > > This submits the buffer list on the first checkpoint that completes
> > > with a new start LSN, not when all the checkpoints with the same
> > > start LSN complete. i.e.:
> > > 
> > > checkpoint  start LSN	commit lsn	submission on commit record
> > > A		32	  63		buffer list for A
> > > B		64	  68		buffer list for B
> > > C		64	  92		nothing, start lsn unchanged
> > > D		64	 127		nothing, start lsn unchanged
> > > E		128	 192		buffer list for C, D and E
> > > 
> > 
> > I have different understanding about this code. In the first checkpoint's
> > handle on commit record, buffer_list is empty and l_recovery_lsn update to
> > the first checkpoint's lsn, the result is that each checkpoint's submit
> > logic try to submit the buffers which was added to buffer list in checkpoint
> > recovery of previous LSN.
> > 
> >   xlog_do_recovery_pass
> >     LIST_HEAD (buffer_list);
> >     xlog_recover_process
> >       xlog_recover_process_data
> >         xlog_recover_process_ophdr
> >           xlog_recovery_process_trans
> >             if (log->l_recovery_lsn != trans->r_lsn &&
> >                 ohead->oh_flags & XLOG_COMMIT_TRANS) { 
> >               xfs_buf_delwri_submit(buffer_list); //submit buffer list
> >               log->l_recovery_lsn = trans->r_lsn;
> >             }
> >             xlog_recovery_process_trans
> >               xlog_recover_commit_trans
> >                 xlog_recover_items_pass2
> >                   item->ri_ops->commit_pass2
> >                     xlog_recover_buf_commit_pass2
> >                       xfs_buf_delwri_queue(bp, buffer_list) //add bp to buffer list
> >     if (!list_empty(&buffer_list)) 
> >       /* submit buffers that was added in checkpoint recovery of last LSN */
> >       xfs_buf_delwri_submit(&buffer_list)
> > 
> > So, I think it should be:
> >     
> > checkpoint  start LSN	commit lsn	submission on commit record
> > A		32	  63		nothing, buffer list is empty
> > B		64	  68		buffer list for A
> > C		64	  92		nothing, start lsn unchanged
> > D		64	 127		nothing, start lsn unchanged
> > E		128	 192		buffer list for B, C and D

Yeah, I managed to look through some of this code yesterday but hadn't
got back to this. I suspect the disconnect here is that the buffer list
isn't populated until we process the commit record for the associated
log transaction, and that occurs after the check to submit the buffer
list.

So when the commit record of A is seen, the recovery LSN is updated to
32 and the buffer list is still empty, and then is subsequently
populated based on the content of the transaction. From there, not until
we see a commit record for a transaction with a start LSN != 32 is the
buffer list submitted. This is how recovery knows there are no more
metadata lsn == 32 items, and the buffer list may include any number of
checkpoints that start at the current recovery lsn.

In general, this relies on proper commit record ordering which AFAIA
remains a pretty fundamental rule of XFS logging. It seems the confusion
has been resolved in any event, but something that comes to mind as
possibly beneficial to future readers is to perhaps consider turning
this submit check into a small helper with a proper name and any tweaks
that might help clarify the existing comment.

I think Dave's followup comments to invoke shutdown rather than play
games with cancellation make sense, so I'm particularly wondering if we
could just create an xlog_recovery_process_buffers() or some such helper
that could be reused in both places by conditionally doing the
shutdown/submit on error, and then documents everything nicely in once
place. Just a thought, however. It may not be worth it depending on how
ugly it looks.

Brian

> 
> You are right, I made a mistake in determining the order of buffer
> list submission vs checkpoint recovery taht builds a given buffer
> list. Mistakes happen when you only look at complex code once every
> few years. I will go back and look at the original patch again with
> this in mind.
> 
> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux