Re: [PATCH 3/9] xfs: background AIL push targets physical space, not grant space

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Fri, 26 Aug 2022 08:47:35 -0700

On Tue, Aug 23, 2022 at 12:01:03PM +1000, Dave Chinner wrote:
> On Mon, Aug 22, 2022 at 12:00:03PM -0700, Darrick J. Wong wrote:
> > On Wed, Aug 10, 2022 at 09:03:47AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > 
> > > Currently the AIL attempts to keep 25% of the "log space" free,
> > > where the current used space is tracked by the reserve grant head.
> > > That is, it tracks both physical space used plus the amount reserved
> > > by transactions in progress.
> > > 
> > > When we start tail pushing, we are trying to make space for new
> > > reservations by writing back older metadata. At that point the log is
> > > generally physically full of dirty metadata, and reservations for
> > > modifications in flight take up whatever space the AIL can physically
> > > free up.
> > > 
> > > Hence we don't really need to take into account the reservation
> > > space that has been used - we just need to keep the log tail moving
> > > as fast as we can to free up space for more reservations to be made.
> > > We know exactly how much physical space the journal is consuming in
> > > the AIL (i.e. max LSN - min LSN) so we can base push thresholds
> > > directly on this state rather than have to look at grant head
> > > reservations to determine how much to physically push out of the
> > > log.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > Makes sense, I think.  Though I was wondering about the last patch --
> > pushing the AIL until it's empty when a trans_alloc can't find grant
> > reservation could take a while on slow storage.
> 
> The push in the grant reservation code is not a blocking push - it
> just tells the AIL to start pushing everything, then it goes to
> sleep waiting for the tail to move and space to come available. The
> AIL behaviour is largely unchanged, especially if the application is
> running under even slight memory pressure as the inode shrinker will
> repeatedly kick the AIL push-all trigger regardless of consumed
> journal/grant space.

Ok.

> > Does this mean that
> > we're trading the incremental freeing-up of the existing code for
> > potentially higher transaction allocation latency in the hopes that more
> > threads can get reservation?  Or does the "keep the AIL going" bits make
> > up for that?
> 
> So far I've typically measured slightly lower worst case latencies
> with this mechanism than with the existing "repeatedly push to 25%
> free" that we currently have. It's not really significant enough to
> make statements about (unlike cpu usage reductions or perf
> increases), but it does seem to be a bit better...

<nod>

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
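
For readers following the thread, the idea in the commit message (base the
push threshold on the physical space the AIL spans, i.e. max LSN - min LSN,
rather than on the reserve grant head) can be illustrated with a small
standalone sketch. This is not the code from the patch. Every name below
(sketch_lsn_t, sketch_ail_space_used, sketch_ail_should_push) is
hypothetical, the LSN is assumed to pack a cycle number in the high 32 bits
and a block number in the low 32 bits, and the 25% free-space target is
simply carried over from the existing behaviour described above as an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical LSN layout for this sketch only: log cycle in the high
 * 32 bits, block offset within the log in the low 32 bits.
 */
typedef uint64_t sketch_lsn_t;

static inline sketch_lsn_t sketch_lsn(uint32_t cycle, uint32_t block)
{
	return ((uint64_t)cycle << 32) | block;
}

static inline uint32_t sketch_cycle(sketch_lsn_t lsn)
{
	return (uint32_t)(lsn >> 32);
}

static inline uint32_t sketch_block(sketch_lsn_t lsn)
{
	return (uint32_t)lsn;
}

/*
 * Physical blocks of a circular log spanned by the AIL, computed as
 * (max LSN - min LSN). The two LSNs may be in the same cycle or the
 * max may have wrapped exactly once past the end of the log.
 */
static uint64_t sketch_ail_space_used(sketch_lsn_t min_lsn,
				      sketch_lsn_t max_lsn,
				      uint32_t log_blocks)
{
	assert(log_blocks > 0);
	assert(max_lsn >= min_lsn);

	uint32_t cycles = sketch_cycle(max_lsn) - sketch_cycle(min_lsn);
	assert(cycles <= 1);

	if (cycles == 0)
		return sketch_block(max_lsn) - sketch_block(min_lsn);

	/* Wrapped once: the head cannot overtake the tail. */
	assert(sketch_block(max_lsn) <= sketch_block(min_lsn));
	return (uint64_t)log_blocks - sketch_block(min_lsn) +
	       sketch_block(max_lsn);
}

/*
 * Decide whether to push, looking only at physical usage. The 25% free
 * target is an assumption for illustration, not taken from the patch.
 */
static bool sketch_ail_should_push(sketch_lsn_t min_lsn,
				   sketch_lsn_t max_lsn,
				   uint32_t log_blocks)
{
	uint64_t used = sketch_ail_space_used(min_lsn, max_lsn, log_blocks);
	uint64_t free_blocks = log_blocks - used;

	return free_blocks * 4 < log_blocks;
}
```

Note that nothing in the decision consults in-flight transaction
reservations; only the two LSNs and the log size are needed, which is the
simplification the commit message argues for.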