On Thu, Jan 21, 2021 at 10:45:21AM -0500, Brian Foster wrote: > The log quiesce mechanism historically terminates by marking the log > clean with an unmount record. The primary objective is to indicate > that log recovery is no longer required after the quiesce has > flushed all in-core changes and written back filesystem metadata. > While this is perfectly fine, it is somewhat hacky as currently used > in certain contexts. For example, filesystem freeze quiesces (i.e. > cleans) the log and immediately redirties it with a dummy superblock > transaction to ensure that log recovery runs in the event of a > crash. > > While this functions correctly, cleaning the log from freeze context > is clearly superfluous given the current redirtying behavior. > Instead, the desired behavior can be achieved by simply covering the > log. This effectively retires all on-disk log items from the active > range of the log by issuing two synchronous and sequential dummy > superblock update transactions that serve to update the on-disk log > head and tail. The subtle difference is that the log technically > remains dirty due to the lack of an unmount record, though recovery > is effectively a no-op due to the content of the checkpoints being > clean (i.e. the unmodified on-disk superblock). > > Log covering currently runs in the background and only triggers once > the filesystem and log has idled. The purpose of the background > mechanism is to prevent log recovery from replaying the most > recently logged items long after those items may have been written > back. In the quiesce path, the log has been deliberately idled by > forcing the log and pushing the AIL until empty in a context where > no further mutable filesystem operations are allowed. Therefore, we > can cover the log as the final step in the log quiesce codepath to > reflect that all previously active items have been successfully > written back. > > This facilitates selective log covering from certain contexts (i.e. > freeze) that only seek to quiesce, but not necessarily clean the > log. Note that as a side effect of this change, log covering now > occurs when cleaning the log as well. This is harmless, facilitates > subsequent cleanups, and is mostly temporary as various operations > switch to use explicit log covering. > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > Reviewed-by: Christoph Hellwig <hch@xxxxxx> Looks good, Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> --D > --- > fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++--- > fs/xfs/xfs_log.h | 2 +- > 2 files changed, 47 insertions(+), 4 deletions(-) > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > index 0fb26b05edc9..7c31b046e790 100644 > --- a/fs/xfs/xfs_log.c > +++ b/fs/xfs/xfs_log.c > @@ -91,6 +91,9 @@ STATIC int > xlog_iclogs_empty( > struct xlog *log); > > +static int > +xfs_log_cover(struct xfs_mount *); > + > static void > xlog_grant_sub_space( > struct xlog *log, > @@ -936,10 +939,9 @@ xfs_log_unmount_write( > * To do this, we first need to shut down the background log work so it is not > * trying to cover the log as we clean up. We then need to unpin all objects in > * the log so we can then flush them out. Once they have completed their IO and > - * run the callbacks removing themselves from the AIL, we can write the unmount > - * record. > + * run the callbacks removing themselves from the AIL, we can cover the log. > */ > -void > +int > xfs_log_quiesce( > struct xfs_mount *mp) > { > @@ -957,6 +959,8 @@ xfs_log_quiesce( > xfs_wait_buftarg(mp->m_ddev_targp); > xfs_buf_lock(mp->m_sb_bp); > xfs_buf_unlock(mp->m_sb_bp); > + > + return xfs_log_cover(mp); > } > > void > @@ -1092,6 +1096,45 @@ xfs_log_need_covered( > return needed; > } > > +/* > + * Explicitly cover the log. This is similar to background log covering but > + * intended for usage in quiesce codepaths. The caller is responsible to ensure > + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL > + * must all be empty. > + */ > +static int > +xfs_log_cover( > + struct xfs_mount *mp) > +{ > + struct xlog *log = mp->m_log; > + int error = 0; > + > + ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) && > + !xfs_ail_min_lsn(log->l_ailp)) || > + XFS_FORCED_SHUTDOWN(mp)); > + > + if (!xfs_log_writable(mp)) > + return 0; > + > + /* > + * To cover the log, commit the superblock twice (at most) in > + * independent checkpoints. The first serves as a reference for the > + * tail pointer. The sync transaction and AIL push empties the AIL and > + * updates the in-core tail to the LSN of the first checkpoint. The > + * second commit updates the on-disk tail with the in-core LSN, > + * covering the log. Push the AIL one more time to leave it empty, as > + * we found it. > + */ > + while (xfs_log_need_covered(mp)) { > + error = xfs_sync_sb(mp, true); > + if (error) > + break; > + xfs_ail_push_all_sync(mp->m_ail); > + } > + > + return error; > +} > + > /* > * We may be holding the log iclog lock upon entering this routine. > */ > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h > index b0400589f824..044e02cb8921 100644 > --- a/fs/xfs/xfs_log.h > +++ b/fs/xfs/xfs_log.h > @@ -138,7 +138,7 @@ void xlog_cil_process_committed(struct list_head *list); > bool xfs_log_item_in_current_chkpt(struct xfs_log_item *lip); > > void xfs_log_work_queue(struct xfs_mount *mp); > -void xfs_log_quiesce(struct xfs_mount *mp); > +int xfs_log_quiesce(struct xfs_mount *mp); > void xfs_log_clean(struct xfs_mount *mp); > bool xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t); > bool xfs_log_in_recovery(struct xfs_mount *); > -- > 2.26.2 >