On Fri, Sep 06, 2019 at 10:05:46AM +1000, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > In the situation where the log is full and the CIL has not recently > flushed, the AIL push threshold is throttled back to the where the > last write of the head of the log was completed. This is stored in > log->l_last_sync_lsn. Hence if the CIL holds > 25% of the log space > pinned by flushes and/or aggregation in progress, we can get the > situation where the head of the log lags a long way behind the > reservation grant head. > > When this happens, the AIL push target is trimmed back from where > the reservation grant head wants to push the log tail to, back to > where the head of the log currently is. This means the push target > doesn't reach far enough into the log to actually move the tail > before the transaction reservation goes to sleep. > > When the CIL push completes, it moves the log head forward such that > the AIL push target can now be moved, but that has no mechanism for > puhsing the log tail. Further, if the next tail movement of the log > is not large enough wake the waiter (i.e. still not enough space for > it to have a reservation granted), we don't wake anything up, and > hence we do not update the AIL push target to take into account the > head of the log moving and allowing the push target to be moved > forwards. > > To avoid this particular condition, if we fail to wake the first > waiter on the grant head because we don't have enough space, > push on the AIL again. This will pick up any movement of the log > head and allow the push target to move forward due to completion of > CIL pushing. > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > --- > fs/xfs/xfs_log.c | 29 ++++++++++++++++++++++++++++- > 1 file changed, 28 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > index b159a9e9fef0..c2862b1a2780 100644 > --- a/fs/xfs/xfs_log.c > +++ b/fs/xfs/xfs_log.c > @@ -214,15 +214,42 @@ xlog_grant_head_wake( > { > struct xlog_ticket *tic; > int need_bytes; > + bool woken_task = false; > > list_for_each_entry(tic, &head->waiters, t_queue) { > + > + /* > + * The is a chance that the size of the CIL checkpoints in "There is a chance..." Other than that, Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> --D > + * progress at the last AIL push target calculation resulted in > + * limiting the target to the log head (l_last_sync_lsn) at the > + * time. This may not reflect where the log head is now as the > + * CIL checkpoints may have completed. > + * > + * Hence when we are woken here, it may be that the head of the > + * log that has moved rather than the tail. As the tail didn't > + * move, there still won't be space available for the > + * reservation we require. However, if the AIL has already > + * pushed to the target defined by the old log head location, we > + * will hang here waiting for something else to update the AIL > + * push target. > + * > + * Therefore, if there isn't space to wake the first waiter on > + * the grant head, we need to push the AIL again to ensure the > + * target reflects both the current log tail and log head > + * position before we wait for the tail to move again. > + */ > + > need_bytes = xlog_ticket_reservation(log, head, tic); > - if (*free_bytes < need_bytes) > + if (*free_bytes < need_bytes) { > + if (!woken_task) > + xlog_grant_push_ail(log, need_bytes); > return false; > + } > > *free_bytes -= need_bytes; > trace_xfs_log_grant_wake_up(log, tic); > wake_up_process(tic->t_task); > + woken_task = true; > } > > return true; > -- > 2.23.0.rc1 >