Re: [PATCH 4/8] xfs: l_last_sync_lsn is really tracking AIL state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 08, 2022 at 11:55:54AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> The current implementation of xlog_assign_tail_lsn() assumes that
> when the AIL is empty, the log tail matches the LSN of the last
> written commit record. This is recorded in xlog_state_set_callback()
> as log->l_last_sync_lsn when the iclog state changes to
> XLOG_STATE_CALLBACK. This change is then immediately followed by
> running the callbacks on the iclog which then insert the log items
> into the AIL at the "commit lsn" of that checkpoint.
> 
> The "commit lsn" of the checkpoint is the LSN of the iclog that the
> commit record is place in. This is also the same iclog that the
> callback is attached to. Hence the log->l_last_sync_lsn is set to
> the same value as the commit lsn recorded for the checkpoint being
> inserted into the AIL and it effectively tracks the highest ever LSN
> that the AIL has seen.
> 
> This value is not used directly by the log. It was used to determine
> the maximum push threshold for AIL pushing to ensure we didn't set
> a push target larger than had been seen in the AIL, but the AIL now
> only needs to look at it's own internal state to set push targets.
> 
> The log itself tracks it's current head via the {current_lsn,
> current_block} tuple, so it doesn't need l_last_sync_lsn for
> tracking the head of the log.  Hence nothing actually uses
> log->l_last_sync_lsn except for the tail assignment when the AIL is
> empty.
> 
> This "highest commit LSN" really doesn't need to be tracked by the
> log - the max LSN seen by the AIL can be easily tracked by the AIL
> and it can be used to set the log tail LSN correctly when the last
> item is removed from the AIL.

Ok, there's a fundamental flaw in this patch: the AIL doesn't track
the commit record LSN of log items, it tracks the *start record LSN* of the
checkpoint.

l_last_sync_lsn is the highest commit record lsn seen by the log,
whilst ailp->ail_max_seen_lsn tracks the highest start record lsn.
The difference between the two is the size of the checkpoint, and
this means by the end of the patch the grant head space that is
tracked does not have the space used by the last checkpoint
committed removed from it.

i.e. grant head space won't go to zero even when the CIL is
empty and there are no transactions in flight - the log needs to be
covered for it to drop to the size of the last iclog written to the
log during covering operations.

This causes problems with small logs - there may be space in the log
for new reservations, but because the grant space doesn't get
decremented by a log force flushing the CIL correctly,
xlog_space_left() doesn't actually report them as free and so
reservations stall until the log is covered some 30s later. Then the
same thing happens again short while later...

Essentially, this patch needs to be reworked - l_last_sync_lsn needs
to stay, but the location that is updated at needs to change to the
CIL commit callbacks so we can update it while holding the AIL lock
and then it doesn't needs to be an atomic variable....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux