Re: [PATCH 11/11] xfs: flush speculative space allocations when we run out of space

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Mon, 25 Jan 2021 16:29:01 -0800

On Mon, Jan 25, 2021 at 04:06:28PM -0500, Brian Foster wrote:
> On Mon, Jan 25, 2021 at 12:02:16PM -0800, Darrick J. Wong wrote:
> > On Sun, Jan 24, 2021 at 09:48:16AM +0000, Christoph Hellwig wrote:
> > > > +retry:
> > > >  	/*
> > > >  	 * Allocate the handle before we do our freeze accounting and setting up
> > > >  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> > > > @@ -285,6 +289,22 @@ xfs_trans_alloc(
> > > >  	tp->t_firstblock = NULLFSBLOCK;
> > > >  
> > > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > +	if (error == -ENOSPC && tries > 0) {
> > > > +		xfs_trans_cancel(tp);
> > > > +
> > > > +		/*
> > > > +		 * We weren't able to reserve enough space for the transaction.
> > > > +		 * Flush the other speculative space allocations to free space.
> > > > +		 * Do not perform a synchronous scan because callers can hold
> > > > +		 * other locks.
> > > > +		 */
> > > > +		error = xfs_blockgc_free_space(mp, NULL);
> > > > +		if (error)
> > > > +			return error;
> > > > +
> > > > +		tries--;
> > > > +		goto retry;
> > > > +	}
> > > >  	if (error) {
> > > >  		xfs_trans_cancel(tp);
> > > >  		return error;
> > > 
> > > Why do we need to restart the whole function?  A failing
> > > xfs_trans_reserve should restore tp to its initial state, and keeping
> > > the SB_FREEZE_FS counter increased also doesn't look harmful as far as

I'm curious about your motivation for letting transaction nest here.
Seeing as the ENOSPC return should be infrequent, are you simply not
wanting to cycle the memory allocators and the FREEZE_FS counters?

Hm.  I guess at this point the only resources we hold are the FREEZE_FS
counter and *tp itself.  The transaction doesn't have any log space
grants or block reservation associated with it, and I guess we're not in
PF_MEMALLOC_NOFS mode either.  So I guess this is ok, except...

> > > I can tell.  So why not:
> > > 
> > > 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > 	if (error == -ENOSPC) {
> > > 		/*
> > > 		 * We weren't able to reserve enough space for the transaction.
> > > 		 * Flush the other speculative space allocations to free space.
> > > 		 * Do not perform a synchronous scan because callers can hold
> > > 		 * other locks.
> > > 		 */
> > > 		error = xfs_blockgc_free_space(mp, NULL);
> > 
> > xfs_blockgc_free_space runs the blockgc scan directly, which means that
> > it creates transactions to free blocks.  Since we can't have nested
> > transactions, we have to drop tp here.
> > 
> 
> Technically, I don't think it's a problem to hold a transaction memory
> allocation (and superblock write access?) while diving into the scanning
> mechanism.

...except that doing so will collide with what we've been telling Yafang
(as part of his series to detect nested transactions) as far as when is
the appropriate time to set current->journal_info/PF_MEMALLOC_NOFS.

> BTW, this also looks like a landmine passing a NULL eofb into
> the xfs_blockgc_free_space() tracepoint.

Errk, will fix that.

--D

> Brian
> 
> > --D
> > 
> > > 		if (error)
> > > 			return error;
> > > 		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > 	}
> > >  	if (error) {
> > >   		xfs_trans_cancel(tp);
> > >   		return error;
> > > 
> > > ?
> > 
>