On Fri, Aug 26, 2022 at 08:47:35AM -0700, Darrick J. Wong wrote:
> On Tue, Aug 23, 2022 at 12:01:03PM +1000, Dave Chinner wrote:
> > On Mon, Aug 22, 2022 at 12:00:03PM -0700, Darrick J. Wong wrote:
> > > On Wed, Aug 10, 2022 at 09:03:47AM +1000, Dave Chinner wrote:
> > > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > >
> > > > Currently the AIL attempts to keep 25% of the "log space" free,
> > > > where the current used space is tracked by the reserve grant head.
> > > > That is, it tracks both physical space used plus the amount reserved
> > > > by transactions in progress.
> > > >
> > > > When we start tail pushing, we are trying to make space for new
> > > > reservations by writing back older metadata and the log is generally
> > > > physically full of dirty metadata, and reservations for modifications
> > > > in flight take up whatever space the AIL can physically free up.
> > > >
> > > > Hence we don't really need to take into account the reservation
> > > > space that has been used - we just need to keep the log tail moving
> > > > as fast as we can to free up space for more reservations to be made.
> > > > We know exactly how much physical space the journal is consuming in
> > > > the AIL (i.e. max LSN - min LSN) so we can base push thresholds
> > > > directly on this state rather than have to look at grant head
> > > > reservations to determine how much to physically push out of the
> > > > log.
> > > >
> > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > Makes sense, I think.  Though I was wondering about the last patch --
> > > pushing the AIL until it's empty when a trans_alloc can't find grant
> > > reservation could take a while on a slow storage.

Now that I've had a chance to see where we're going...

Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>

--D

> >
> > The push in the grant reservation code is not a blocking push - it
> > just tells the AIL to start pushing everything, then it goes to
> > sleep waiting for the tail to move and space to come available. The
> > AIL behaviour is largely unchanged, especially if the application is
> > running under even slight memory pressure as the inode shrinker will
> > repeatedly kick the AIL push-all trigger regardless of consumed
> > journal/grant space.
>
> Ok.
>
> > > Does this mean that
> > > we're trading the incremental freeing-up of the existing code for
> > > potentially higher transaction allocation latency in the hopes that more
> > > threads can get reservation?  Or does the "keep the AIL going" bits make
> > > up for that?
> >
> > So far I've typically measured slightly lower worst case latencies
> > with this mechanism than with the existing "repeatedly push to 25%
> > free" that we currently have. It's not really significant enough to
> > make statements about (unlike cpu usage reductions or perf
> > increases), but it does seem to be a bit better...
>
> <nod>
>
> --D
>
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@xxxxxxxxxxxxx
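
For readers following along, the "max LSN - min LSN" idea from the commit
message can be sketched roughly as below. This is only an illustration under
stated assumptions, not the code from the patch: `struct lsn`,
`ail_space_used()` and `ail_needs_push()` are hypothetical names, the LSN is
modelled as a plain (cycle, block) pair, and the head is assumed to be at most
one cycle ahead of the tail. The 25% figure is the threshold mentioned in the
commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified LSN: log cycle number plus block offset. */
struct lsn {
	uint32_t cycle;
	uint32_t block;
};

/*
 * Blocks physically spanned by the AIL, from its oldest item (min)
 * to its newest item (max), in a circular log of log_blocks blocks.
 * Assumes max is in the same cycle as min or exactly one cycle ahead.
 */
static uint32_t ail_space_used(struct lsn min, struct lsn max,
			       uint32_t log_blocks)
{
	assert(log_blocks > 0);
	assert(min.block < log_blocks && max.block < log_blocks);

	if (max.cycle == min.cycle)
		return max.block - min.block;

	/* The head has wrapped around the end of the log. */
	return log_blocks - min.block + max.block;
}

/*
 * Decide whether to push the tail, looking only at the physical
 * space the AIL pins and ignoring grant head reservations.
 */
static bool ail_needs_push(struct lsn min, struct lsn max,
			   uint32_t log_blocks)
{
	uint32_t used = ail_space_used(min, max, log_blocks);
	uint32_t free_blocks = log_blocks - used;

	/* Push once less than 25% of the log is physically free. */
	return free_blocks < log_blocks / 4;
}
```

The point of the sketch is that both inputs come straight from the AIL, so no
accounting of in-flight transaction reservations is needed to make the
decision.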