Re: [PATCH v14 3/8] drm/ttm/pool: Provide a helper to shrink pages

Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> · Tue, 03 Dec 2024 17:31:30 +0100

On Tue, 2024-12-03 at 17:20 +0100, Christian König wrote:
> [SNIP]
> > > > > > @@ -453,9 +601,36 @@ int ttm_pool_alloc(struct ttm_pool
> > > > > > *pool,
> > > > > > struct ttm_tt *tt,
> > > > > >     	else
> > > > > >     		gfp_flags |= GFP_HIGHUSER;
> > > > > >     
> > > > > > -	for (order = min_t(unsigned int, MAX_PAGE_ORDER,
> > > > > > __fls(num_pages));
> > > > > > -	     num_pages;
> > > > > > -	     order = min_t(unsigned int, order,
> > > > > > __fls(num_pages)))
> > > > > > {
> > > > > > +	order = min_t(unsigned int, MAX_PAGE_ORDER,
> > > > > > __fls(num_pages));
> > > > > > +
> > > > > > +	if (tt->page_flags & TTM_TT_FLAG_PRIV_BACKED_UP) {
> > > > > > +		if (!tt->restore) {
> > > > > > +			gfp_t gfp = GFP_KERNEL |
> > > > > > __GFP_NOWARN;
> > > > > > +
> > > > > > +			if (ctx->gfp_retry_mayfail)
> > > > > > +				gfp |=
> > > > > > __GFP_RETRY_MAYFAIL;
> > > > > > +
> > > > > > +			tt->restore =
> > > > > > +				kvzalloc(struct_size(tt-
> > > > > > > restore,
> > > > > > old_pages,
> > > > > > +						    
> > > > > > (size_t)1
> > > > > > <<
> > > > > > order), gfp);
> > > > > > +			if (!tt->restore)
> > > > > > +				return -ENOMEM;
> > > > > > +		} else if (ttm_pool_restore_valid(tt-
> > > > > > > restore)) {
> > > > > > +			struct ttm_pool_tt_restore
> > > > > > *restore =
> > > > > > tt-
> > > > > > > restore;
> > > > > > +
> > > > > > +			num_pages -= restore-
> > > > > > >alloced_pages;
> > > > > > +			order = min_t(unsigned int, order,
> > > > > > __fls(num_pages));
> > > > > > +			pages += restore->alloced_pages;
> > > > > > +			r = ttm_pool_restore_tt(restore,
> > > > > > tt-
> > > > > > > backup, ctx);
> > > > > > +			if (r)
> > > > > > +				return r;
> > > > > > +			caching = restore->caching_divide;
> > > > > > +		}
> > > > > > +
> > > > > > +		tt->restore->pool = pool;
> > > > > > +	}
> > > > > Hui? Why is that part of the allocation function now?
> > > > > 
> > > > > At bare minimum I would expect that this is a new function.
> > > > It's because we now have partially backed up tts, so the
> > > > restore is
> > > > interleaved on a per-page basis, replacing the backup handles
> > > > with
> > > > page-pointers. I'll see if I can separate out at least the
> > > > initialization here.
> > > Yeah, that kind of makes sense.
> > > 
> > > My expectation was just that we now have explicit
> > > ttm_pool_swapout()
> > > and
> > > ttm_pool_swapin() functions.
> > I fully understand, although in the allocation step, that would
> > also
> > increase the memory pressure since we might momentarily have twice
> > the
> > bo-size allocated, if the shmem object was never swapped out, and
> > we
> > don't want to unnecessarily risc OOM at recover time, although that
> > should be a recoverable situation now. If the OOM receiver can free
> > up
> > system memory resources they can could potentially restart the
> > recover.
> 
> What I meant was more that we have ttm_pool_swapout() which does a
> mix 
> of moving each page to a swap backend and freeing one by one.
> 
> And ttm_pool_swapin() which allocates a bit of memory (usually one
> huge 
> page) and then copies the content back in from the swap backend.
> 
> Alternatively we could rename ttm_pool_alloc() into something like 
> ttm_pool_populate() and ttm_pool_free() into ttm_pool_unpopulate(),
> but 
> those names are not very descriptive either.
> 
> It's just that we now do a bit more than just alloc and free in those
> functions, so the naming doesn't really match that well any more.

So what about ttm_pool_alloc() and ttm_pool_recover/swapin(), both
pointing to the same code, but _alloc() asserts that the tt isn't
backed up?

That would give a clean interface at least.

For a renaming change that touch all TTM drivers, I'd rather put that
as a last patch since getting acks for that from all TTM driver
maintainers seems like a hopeless undertaking.

/Thomas

> 
> Christian.
> 
> > 
> > /Thomas