Re: [PATCH] xfs: ensure submit buffers on LSN boundaries in error handlers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 10, 2024 at 03:03:24PM +0800, Long Li wrote:
> On Wed, Jan 10, 2024 at 08:43:31AM +1100, Dave Chinner wrote:
> > The issue is that the code as it stands doesn't handle object
> > recovery from multiple checkpoints with the same start lsn. The
> > easiest way to understand this is to look at the buffer submit logic
> > on completion of a checkpoint:
> > 
> > 	if (log->l_recovery_lsn != trans->r_lsn &&
> >             ohead->oh_flags & XLOG_COMMIT_TRANS) {
> >                 error = xfs_buf_delwri_submit(buffer_list);
> >                 if (error)
> >                         return error;
> >                 log->l_recovery_lsn = trans->r_lsn;
> >         }
> > 
> > This submits the buffer list on the first checkpoint that completes
> > with a new start LSN, not when all the checkpoints with the same
> > start LSN complete. i.e.:
> > 
> > checkpoint  start LSN	commit lsn	submission on commit record
> > A		32	  63		buffer list for A
> > B		64	  68		buffer list for B
> > C		64	  92		nothing, start lsn unchanged
> > D		64	 127		nothing, start lsn unchanged
> > E		128	 192		buffer list for C, D and E
> > 
> 
> I have different understanding about this code. In the first checkpoint's
> handle on commit record, buffer_list is empty and l_recovery_lsn update to
> the first checkpoint's lsn, the result is that each checkpoint's submit
> logic try to submit the buffers which was added to buffer list in checkpoint
> recovery of previous LSN.
> 
>   xlog_do_recovery_pass
>     LIST_HEAD (buffer_list);
>     xlog_recover_process
>       xlog_recover_process_data
>         xlog_recover_process_ophdr
>           xlog_recovery_process_trans
>             if (log->l_recovery_lsn != trans->r_lsn &&
>                 ohead->oh_flags & XLOG_COMMIT_TRANS) { 
>               xfs_buf_delwri_submit(buffer_list); //submit buffer list
>               log->l_recovery_lsn = trans->r_lsn;
>             }
>             xlog_recovery_process_trans
>               xlog_recover_commit_trans
>                 xlog_recover_items_pass2
>                   item->ri_ops->commit_pass2
>                     xlog_recover_buf_commit_pass2
>                       xfs_buf_delwri_queue(bp, buffer_list) //add bp to buffer list
>     if (!list_empty(&buffer_list)) 
>       /* submit buffers that was added in checkpoint recovery of last LSN */
>       xfs_buf_delwri_submit(&buffer_list)
> 
> So, I think it should be:
>     
> checkpoint  start LSN	commit lsn	submission on commit record
> A		32	  63		nothing, buffer list is empty
> B		64	  68		buffer list for A
> C		64	  92		nothing, start lsn unchanged
> D		64	 127		nothing, start lsn unchanged
> E		128	 192		buffer list for B, C and D

You are right, I made a mistake in determining the order of buffer
list submission vs checkpoint recovery taht builds a given buffer
list. Mistakes happen when you only look at complex code once every
few years. I will go back and look at the original patch again with
this in mind.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux