On Wed, Jan 10, 2024 at 03:03:24PM +0800, Long Li wrote: > On Wed, Jan 10, 2024 at 08:43:31AM +1100, Dave Chinner wrote: > > The issue is that the code as it stands doesn't handle object > > recovery from multiple checkpoints with the same start lsn. The > > easiest way to understand this is to look at the buffer submit logic > > on completion of a checkpoint: > > > > if (log->l_recovery_lsn != trans->r_lsn && > > ohead->oh_flags & XLOG_COMMIT_TRANS) { > > error = xfs_buf_delwri_submit(buffer_list); > > if (error) > > return error; > > log->l_recovery_lsn = trans->r_lsn; > > } > > > > This submits the buffer list on the first checkpoint that completes > > with a new start LSN, not when all the checkpoints with the same > > start LSN complete. i.e.: > > > > checkpoint start LSN commit lsn submission on commit record > > A 32 63 buffer list for A > > B 64 68 buffer list for B > > C 64 92 nothing, start lsn unchanged > > D 64 127 nothing, start lsn unchanged > > E 128 192 buffer list for C, D and E > > > > I have different understanding about this code. In the first checkpoint's > handle on commit record, buffer_list is empty and l_recovery_lsn update to > the first checkpoint's lsn, the result is that each checkpoint's submit > logic try to submit the buffers which was added to buffer list in checkpoint > recovery of previous LSN. > > xlog_do_recovery_pass > LIST_HEAD (buffer_list); > xlog_recover_process > xlog_recover_process_data > xlog_recover_process_ophdr > xlog_recovery_process_trans > if (log->l_recovery_lsn != trans->r_lsn && > ohead->oh_flags & XLOG_COMMIT_TRANS) { > xfs_buf_delwri_submit(buffer_list); //submit buffer list > log->l_recovery_lsn = trans->r_lsn; > } > xlog_recovery_process_trans > xlog_recover_commit_trans > xlog_recover_items_pass2 > item->ri_ops->commit_pass2 > xlog_recover_buf_commit_pass2 > xfs_buf_delwri_queue(bp, buffer_list) //add bp to buffer list > if (!list_empty(&buffer_list)) > /* submit buffers that was added in checkpoint recovery of last LSN */ > xfs_buf_delwri_submit(&buffer_list) > > So, I think it should be: > > checkpoint start LSN commit lsn submission on commit record > A 32 63 nothing, buffer list is empty > B 64 68 buffer list for A > C 64 92 nothing, start lsn unchanged > D 64 127 nothing, start lsn unchanged > E 128 192 buffer list for B, C and D You are right, I made a mistake in determining the order of buffer list submission vs checkpoint recovery taht builds a given buffer list. Mistakes happen when you only look at complex code once every few years. I will go back and look at the original patch again with this in mind. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx