On Tue, Sep 10, 2024 at 07:28:46AM +0300, Christoph Hellwig wrote:
> An unclean log can contain both the transaction that created a new
> allocation group and the first transaction that is freeing space from
> it, in which case the extent free item recovery requires the perag
> structure to be present.
>
> Currently the perag structures are only created after log recovery
> has completed, leading a warning and file system shutdown for the
> above case.

I'm missing something - the intents aren't processed until the log
has been recovered - queuing an intent to be processed does not
require the per-ag to be present. We don't take per-ag references
until we are recovering the intent. i.e. we've completed journal
recovery and haven't found the corresponding EFD. That leaves the EFI
in the log->r_dfops, and we then run ->recover_work in the second
phase of recovery. It is xfs_extent_free_recover_work() that creates
the new transaction and runs the EFI processing that requires the
perag references, isn't it?

IOWs, I don't see where the initial EFI/EFD recovery during the
checkpoint processing requires the newly created perags to be present
in memory for processing incomplete EFIs before the journal recovery
phase has completed.

>
> Fix this by creating new perag structures and updating
> the in-memory superblock fields as soon a buffer log item that covers
> the primary super block is recovered.
>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  fs/xfs/libxfs/xfs_log_recover.h |  2 ++
>  fs/xfs/xfs_buf_item_recover.c   | 16 +++++++++
>  fs/xfs/xfs_log_recover.c        | 59 ++++++++++++++-------------------
>  3 files changed, 43 insertions(+), 34 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
> index 521d327e4c89ed..d0e13c84422d0a 100644
> --- a/fs/xfs/libxfs/xfs_log_recover.h
> +++ b/fs/xfs/libxfs/xfs_log_recover.h
> @@ -165,4 +165,6 @@ void xlog_recover_intent_item(struct xlog *log, struct xfs_log_item *lip,
>  int xlog_recover_finish_intent(struct xfs_trans *tp,
>  		struct xfs_defer_pending *dfp);
>
> +int xlog_recover_update_agcount(struct xfs_mount *mp, struct xfs_dsb *dsb);
> +
>  #endif /* __XFS_LOG_RECOVER_H__ */
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index 09e893cf563cb9..033821a56b6ac6 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -969,6 +969,22 @@ xlog_recover_buf_commit_pass2(
>  			goto out_release;
>  	} else {
>  		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
> +
> +		/*
> +		 * Update the in-memory superblock and perag structures from the
> +		 * primary SB buffer.
> +		 *
> +		 * This is required because transactions running after growfs
> +		 * may require in-memory structures like the perag right after
> +		 * committing the growfs transaction that created the underlying
> +		 * objects.
> +		 */
> +		if ((xfs_blft_from_flags(buf_f) & XFS_BLFT_SB_BUF) &&
> +		    xfs_buf_daddr(bp) == 0) {
> +			error = xlog_recover_update_agcount(mp, bp->b_addr);
> +			if (error)
> +				goto out_release;
> +		}
>  	}

If we are going to keep this logic, can you do this as a separate
helper function?
i.e.:

	if (inode buffer) {
		xlog_recover_do_inode_buffer();
	} else if (dquot buffer) {
		xlog_recover_do_dquot_buffer();
	} else if (superblock buffer) {
		xlog_recover_do_sb_buffer();
	} else {
		xlog_recover_do_reg_buffer();
	}

and

xlog_recover_do_sb_buffer()
{
	error = xlog_recover_do_reg_buffer()
	if (error || xfs_buf_daddr(bp) != XFS_SB_ADDR)
		return error;
	return xlog_recover_update_agcount();
}

>
>  /*
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 2af02b32f419c2..7d7ab146cae758 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3334,6 +3334,30 @@ xlog_do_log_recovery(
>  	return error;
>  }
>
> +int
> +xlog_recover_update_agcount(
> +	struct xfs_mount		*mp,
> +	struct xfs_dsb			*dsb)
> +{
> +	xfs_agnumber_t			old_agcount = mp->m_sb.sb_agcount;
> +	int				error;
> +
> +	xfs_sb_from_disk(&mp->m_sb, dsb);
> +	if (mp->m_sb.sb_agcount < old_agcount) {
> +		xfs_alert(mp, "Shrinking AG count in log recovery");
> +		return -EFSCORRUPTED;
> +	}
> +	mp->m_features |= xfs_sb_version_to_features(&mp->m_sb);

I'm not sure this is safe. The item order in the checkpoint recovery
isn't guaranteed to be exactly the same as when feature bits are
modified at runtime. Hence there could be items in the checkpoint
that haven't yet been recovered that are dependent on the original sb
feature mask being present. It may be OK to do this at the end of the
checkpoint being recovered.

I'm also not sure why this feature update code is being changed
because it's not mentioned at all in the commit message.

> +	error = xfs_initialize_perag(mp, old_agcount, mp->m_sb.sb_agcount,
> +			mp->m_sb.sb_dblocks, &mp->m_maxagi);

Why do this if sb_agcount has not changed? AFAICT it only iterates
the AGs already initialised and so skips them, then recalculates
inode32 and prealloc block parameters, which won't change. Hence it's
a total no-op for anything other than an actual ag count change and
should be skipped, right?

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
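
The two structural suggestions in the review (a dedicated superblock
buffer helper, and skipping perag initialisation when the AG count has
not changed) can be modelled in user space roughly as follows. This is
only a sketch under stated assumptions: every model_* name, the struct
layout, and the MODEL_* constants are hypothetical stand-ins invented
for illustration, not kernel APIs, and the feature-mask ordering
concern is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define MODEL_SB_DADDR		0	/* primary superblock lives at daddr 0 */
#define MODEL_EFSCORRUPTED	(-117)	/* stand-in for -EFSCORRUPTED */

struct model_mount {
	uint32_t	agcount;	/* in-memory AG count */
	uint32_t	perags_inited;	/* number of perag structures that exist */
	unsigned int	init_calls;	/* how often the perag init stub ran */
};

/* Stub for perag initialisation: pretends to create AGs [old, new). */
int model_init_perags(struct model_mount *mp, uint32_t old_agcount,
		      uint32_t new_agcount)
{
	(void)old_agcount;
	mp->init_calls++;
	if (new_agcount > mp->perags_inited)
		mp->perags_inited = new_agcount;
	return 0;
}

/* Apply a recovered AG count; reject shrinks, skip work if unchanged. */
int model_update_agcount(struct model_mount *mp, uint32_t disk_agcount)
{
	uint32_t old_agcount;

	assert(mp != NULL);
	old_agcount = mp->agcount;

	if (disk_agcount < old_agcount)
		return MODEL_EFSCORRUPTED;
	if (disk_agcount == old_agcount)
		return 0;		/* nothing new to initialise */

	mp->agcount = disk_agcount;
	return model_init_perags(mp, old_agcount, disk_agcount);
}

/* Stub for regular buffer replay; always succeeds in this model. */
int model_do_reg_buffer(void)
{
	return 0;
}

/*
 * Superblock buffer helper: replay the buffer like any other, then
 * update the AG count only when it is the primary superblock.
 */
int model_do_sb_buffer(struct model_mount *mp, uint64_t daddr,
		       uint32_t disk_agcount)
{
	int error = model_do_reg_buffer();

	if (error || daddr != MODEL_SB_DADDR)
		return error;
	return model_update_agcount(mp, disk_agcount);
}
```

In this model, replaying a primary superblock with an unchanged AG
count never reaches the perag init stub, a grown AG count reaches it
exactly once, and a secondary superblock (non-zero daddr) leaves the
in-memory state alone.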