The log quiesce mechanism historically terminates by marking the log clean with an unmount record. The primary objective is to indicate that log recovery is no longer required after the quiesce has flushed all in-core changes and written back filesystem metadata. While this is perfectly fine, it is somewhat hacky as currently used in certain contexts. For example, filesystem freeze quiesces (i.e. cleans) the log and immediately redirties it with a dummy superblock transaction to ensure that log recovery runs in the event of a crash. While this functions correctly, cleaning the log from freeze context is clearly superfluous given the current redirtying behavior. Instead, the desired behavior can be achieved by simply covering the log. This effectively retires all on-disk log items from the active range of the log by issuing two synchronous and sequential dummy superblock update transactions that serve to update the on-disk log head and tail. The subtle difference is that the log technically remains dirty due to the lack of an unmount record, though recovery is effectively a no-op due to the content of the checkpoints being clean (i.e. the unmodified on-disk superblock). Log covering currently runs in the background and only triggers once the filesystem and log has idled. The purpose of the background mechanism is to prevent log recovery from replaying the most recently logged items long after those items may have been written back. In the quiesce path, the log has been deliberately idled by forcing the log and pushing the AIL until empty in a context where no further mutable filesystem operations are allowed. Therefore, we can cover the log as the final step in the log quiesce codepath to reflect that all previously active items have been successfully written back. This facilitates selective log covering from certain contexts (i.e. freeze) that only seek to quiesce, but not necessarily clean the log. Note that as a side effect of this change, log covering now occurs when cleaning the log as well. This is harmless, facilitates subsequent cleanups, and is mostly temporary as various operations switch to use explicit log covering. Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> Reviewed-by: Christoph Hellwig <hch@xxxxxx> --- fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++--- fs/xfs/xfs_log.h | 2 +- 2 files changed, 47 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index 0fb26b05edc9..7c31b046e790 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -91,6 +91,9 @@ STATIC int xlog_iclogs_empty( struct xlog *log); +static int +xfs_log_cover(struct xfs_mount *); + static void xlog_grant_sub_space( struct xlog *log, @@ -936,10 +939,9 @@ xfs_log_unmount_write( * To do this, we first need to shut down the background log work so it is not * trying to cover the log as we clean up. We then need to unpin all objects in * the log so we can then flush them out. Once they have completed their IO and - * run the callbacks removing themselves from the AIL, we can write the unmount - * record. + * run the callbacks removing themselves from the AIL, we can cover the log. */ -void +int xfs_log_quiesce( struct xfs_mount *mp) { @@ -957,6 +959,8 @@ xfs_log_quiesce( xfs_wait_buftarg(mp->m_ddev_targp); xfs_buf_lock(mp->m_sb_bp); xfs_buf_unlock(mp->m_sb_bp); + + return xfs_log_cover(mp); } void @@ -1092,6 +1096,45 @@ xfs_log_need_covered( return needed; } +/* + * Explicitly cover the log. This is similar to background log covering but + * intended for usage in quiesce codepaths. The caller is responsible to ensure + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL + * must all be empty. + */ +static int +xfs_log_cover( + struct xfs_mount *mp) +{ + struct xlog *log = mp->m_log; + int error = 0; + + ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) && + !xfs_ail_min_lsn(log->l_ailp)) || + XFS_FORCED_SHUTDOWN(mp)); + + if (!xfs_log_writable(mp)) + return 0; + + /* + * To cover the log, commit the superblock twice (at most) in + * independent checkpoints. The first serves as a reference for the + * tail pointer. The sync transaction and AIL push empties the AIL and + * updates the in-core tail to the LSN of the first checkpoint. The + * second commit updates the on-disk tail with the in-core LSN, + * covering the log. Push the AIL one more time to leave it empty, as + * we found it. + */ + while (xfs_log_need_covered(mp)) { + error = xfs_sync_sb(mp, true); + if (error) + break; + xfs_ail_push_all_sync(mp->m_ail); + } + + return error; +} + /* * We may be holding the log iclog lock upon entering this routine. */ diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h index b0400589f824..044e02cb8921 100644 --- a/fs/xfs/xfs_log.h +++ b/fs/xfs/xfs_log.h @@ -138,7 +138,7 @@ void xlog_cil_process_committed(struct list_head *list); bool xfs_log_item_in_current_chkpt(struct xfs_log_item *lip); void xfs_log_work_queue(struct xfs_mount *mp); -void xfs_log_quiesce(struct xfs_mount *mp); +int xfs_log_quiesce(struct xfs_mount *mp); void xfs_log_clean(struct xfs_mount *mp); bool xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t); bool xfs_log_in_recovery(struct xfs_mount *); -- 2.26.2