Re: [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Thu, 7 Jan 2021 11:30:50 -0800

On Wed, Jan 06, 2021 at 12:41:23PM -0500, Brian Foster wrote:
> Now that log covering occurs on quiesce, we'd like to reuse the
> underlying superblock sync for final superblock updates. This
> includes things like lazy superblock counter updates, log feature
> incompat bits in the future, etc. One quirk to this approach is that
> once the log is in the IDLE (i.e. already covered) state, any
> subsequent log write resets the state back to NEED. This means that
> a final superblock sync to an already covered log requires two more
> sb syncs to return the log back to IDLE again.
> 
> For example, if a lazy superblock enabled filesystem is mount cycled
> without any modifications, the unmount path syncs the superblock
> once and writes an unmount record. With the desired log quiesce
> covering behavior, we sync the superblock three times at unmount
> time: once for the lazy superblock counter update and twice more to
> cover the log. By contrast, if the log is active or only partially
> covered at unmount time, a final superblock sync would doubly serve
> as the one or two remaining syncs required to cover the log.
> 
> This duplicate covering sequence is unnecessary because the
> filesystem remains consistent if a crash occurs at any point. The
> superblock will either be recovered in the event of a crash or
> written back before the log is quiesced and potentially cleaned with
> an unmount record.
> 
> Update the log covering state machine to remain in the IDLE state if
> additional covering checkpoints pass through the log. This
> facilitates final superblock updates (such as lazy superblock
> counters) via a single sb sync without losing covered status. This
> provides some consistency with the active and partially covered
> cases and also avoids harmless, but spurious checkpoints when
> quiescing the log.
> 
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>

I /think/ the premise of this is ok.

I found myself wondering if xlog_state_activate_iclog could mistake an
iclog containing 5 log ops from a real update for an iclog containing
just the dummy transaction, since a synchronous inode mtime update
transaction can also produce an iclog with 5 ops.  I /think/ that
doesn't matter because xlog_covered_state only cares about the value of
iclogs_changed if the log worker previously set the log state to DONE or
DONE2, and iclogs_changed won't be 1 here if there were multiple dirty
iclogs or if the sole dirty iclog contains more than just the dummy.

If I got that right,
Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D

> ---
>  fs/xfs/xfs_log.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f7b23044723d..9b8564f280b7 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2597,12 +2597,15 @@ xlog_covered_state(
>  	int			iclogs_changed)
>  {
>  	/*
> -	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
> -	 * are done writing the dummy record.  If we are done with the second
> -	 * dummy recored (DONE2), then we go to IDLE.
> +	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
> +	 * wrote the first covering record (DONE). We go to IDLE if we just
> +	 * wrote the second covering record (DONE2) and remain in IDLE until a
> +	 * non-covering write occurs.
>  	 */
>  	switch (prev_state) {
>  	case XLOG_STATE_COVER_IDLE:
> +		if (iclogs_changed == 1)
> +			return XLOG_STATE_COVER_IDLE;
>  	case XLOG_STATE_COVER_NEED:
>  	case XLOG_STATE_COVER_NEED2:
>  		break;
> -- 
> 2.26.2
>