Re: [PATCH 03/45] xfs: separate CIL commit record IO

Chandan Babu R <chandanrlinux@xxxxxxxxx> · Mon, 08 Mar 2021 14:04:47 +0530

On 05 Mar 2021 at 10:41, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> To allow for iclog IO device cache flush behaviour to be optimised,
> we first need to separate out the commit record iclog IO from the
> rest of the checkpoint so we can wait for the checkpoint IO to
> complete before we issue the commit record.
>
> This separation is only necessary if the commit record is being
> written into a different iclog to the start of the checkpoint as the
> upcoming cache flushing changes require completion ordering against
> the other iclogs submitted by the checkpoint.
>
> If the entire checkpoint and commit is in the one iclog, then they
> are both covered by the one set of cache flush primitives on the
> iclog and hence there is no need to separate them for ordering.
>
> Otherwise, we need to wait for all the previous iclogs to complete
> so they are ordered correctly and made stable by the REQ_PREFLUSH
> that the commit record iclog IO issues. This guarantees that if a
> reader sees the commit record in the journal, they will also see the
> entire checkpoint that commit record closes off.
>
> This also provides the guarantee that when the commit record IO
> completes, we can safely unpin all the log items in the checkpoint
> so they can be written back because the entire checkpoint is stable
> in the journal.
>

I see that xlog_state_clean_iclog() wakes up tasks waiting on
iclog->ic_force_wait, and that xlog_state_clean_iclog() itself is invoked after
the corresponding iclog has been written to disk and its log vectors have been
moved to the AIL. Hence waiting on iclog->ic_force_wait for the previous iclogs
to complete I/O ensures that the commit record iclog is written to disk only
after the previous iclogs have already been written.
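
As a rough userspace illustration of that ordering (pthreads, not kernel
code), the sketch below models the wait with a condition variable. All of the
toy_* names are hypothetical and do not exist in XFS: toy_iclog_clean() stands
in for the wakeup done by xlog_state_clean_iclog(), and the condition variable
stands in for iclog->ic_force_wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iclog; not the kernel's struct xlog_in_core. */
struct toy_iclog {
	pthread_mutex_t	lock;
	pthread_cond_t	force_wait;	/* plays the role of ic_force_wait */
	bool		clean;		/* set once "IO completion" has run */
};

static void toy_iclog_init(struct toy_iclog *ic)
{
	pthread_mutex_init(&ic->lock, NULL);
	pthread_cond_init(&ic->force_wait, NULL);
	ic->clean = false;
}

/* Models the completion side: mark the iclog clean and wake all waiters. */
static void toy_iclog_clean(struct toy_iclog *ic)
{
	pthread_mutex_lock(&ic->lock);
	ic->clean = true;
	pthread_cond_broadcast(&ic->force_wait);
	pthread_mutex_unlock(&ic->lock);
}

/* Models the submitter side: sleep until the iclog has been cleaned. */
static void toy_iclog_wait(struct toy_iclog *ic)
{
	pthread_mutex_lock(&ic->lock);
	while (!ic->clean)
		pthread_cond_wait(&ic->force_wait, &ic->lock);
	pthread_mutex_unlock(&ic->lock);
}

/* Thread body standing in for IO completion of the previous iclog. */
static void *toy_io_completion(void *arg)
{
	toy_iclog_clean(arg);
	return NULL;
}

/*
 * Wait for the previous iclog before "issuing" the commit record.
 * Returns true if the previous iclog was clean at that point.
 */
static bool toy_write_commit_record(struct toy_iclog *prev)
{
	pthread_t io;
	bool ordered;

	if (pthread_create(&io, NULL, toy_io_completion, prev) != 0)
		return false;

	toy_iclog_wait(prev);

	/* The commit record would be submitted here, after the wait. */
	pthread_mutex_lock(&prev->lock);
	ordered = prev->clean;
	pthread_mutex_unlock(&prev->lock);
	assert(ordered);

	pthread_join(io, NULL);
	return ordered;
}
```

Calling toy_iclog_init() on a struct toy_iclog and then
toy_write_commit_record() on it returns true only once the completion thread
has marked the previous iclog clean, which is the ordering property described
above. The real code deals with several iclogs and log-level locking that this
sketch leaves out.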

Reviewed-by: Chandan Babu R <chandanrlinux@xxxxxxxxx>

--
chandan