Re: [RFC] TTM shrinking revisited

Christian König <christian.koenig@xxxxxxx> · Mon, 9 Jan 2023 20:49:28 +0100

On 09.01.23 at 10:14, Thomas Hellström wrote:
Hi, Christian,

Thanks for the feedback. Some additional inline comments and questions:

On 1/4/23 11:31, Christian König wrote:
On 30.12.22 at 12:11, Thomas Hellström wrote:
Hi, Christian, others.

I'm starting to take a look at the TTM shrinker again. We'll probably be
needing it at least for supporting integrated hardware with the xe driver.

So assuming that the last attempt failed because of the need to allocate
shmem pages and lack of writeback at shrink time, I was thinking of the
following approach. (A rough design sketch of the core support for the
last bullet is in patch 1/1. It of course needs polishing if the interface
is at all accepted by the mm people.)

Before embarking on this, any feedback or comments would be greatly
appreciated:

*) Avoid TTM swapping when no swap space is available. Better to adjust the
    TTM swapout watermark, as no pages can be freed to the system anyway.
*) Complement the TTM swapout watermark with a shrinker.
    For cached pages, that may hopefully remove the need for the watermark.
    Possibly a watermark needs to remain for wc pages and / or dma pages,
    depending on how well shrinking them works.

Yeah, that's what I've already tried, and I failed miserably exactly
because of what you described above.

Do you have a test case for this, or a typical failing scenario I can
turn into a kunit test, to motivate the need for direct
insert-to-swap-cache before running it by the -mm people? Otherwise it
will have a high risk of being NAKed, I fear.

No real test case, but Piglit has a test where an application tries to
allocate textures that get bigger and bigger until we run into ENOMEM.

Without the 50% limit we crash pretty easily in an OOM situation.
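
To make the watermark discussion above concrete, here is a minimal
user-space sketch of the policy being debated: allocations are capped at
50% of system memory, and swapout is only attempted when enough free swap
exists to get back under the cap. This is a toy model, not TTM code;
toy_pool, toy_watermark and toy_alloc are hypothetical names invented for
illustration, and the real accounting in TTM is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT TTM code: every name here is illustrative. */
struct toy_pool {
	size_t total_pages;     /* pages of system memory */
	size_t allocated_pages; /* pages currently held by the driver */
	size_t swap_free_pages; /* free swap slots, in pages */
};

/* Watermark at 50% of system memory, as mentioned in the thread. */
static size_t toy_watermark(const struct toy_pool *p)
{
	return p->total_pages / 2;
}

/*
 * Allocate nr pages. If that would exceed the watermark, first swap out
 * the excess, but only when swap can actually hold it: swapping with no
 * free swap space frees nothing for the system.
 * Returns false (think -ENOMEM) without changing state on failure.
 */
static bool toy_alloc(struct toy_pool *p, size_t nr)
{
	size_t limit = toy_watermark(p);

	if (nr > limit)
		return false;

	if (p->allocated_pages + nr > limit) {
		size_t excess = p->allocated_pages + nr - limit;

		if (p->swap_free_pages < excess)
			return false;

		p->allocated_pages -= excess;
		p->swap_free_pages -= excess;
	}

	p->allocated_pages += nr;
	return true;
}
```

With 100 pages of memory and 10 free swap slots, allocating 40 and then 20
pages succeeds (10 pages get swapped out), after which any further
allocation fails cleanly instead of pushing the system into OOM. A shrinker
would replace the fixed 50% cap with reclaim driven by actual memory
pressure.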

*) Trigger immediate writeback of pages handed to the swapcache / shmem,
    at least when the shrinker is called from kswapd.

Not sure if that's really valuable.

Not completely sure either. However, in OOM situations where we need 
to allocate memory to be able to shrink, that would give the system a 
chance to reclaim the pages we shrink before we deplete the kernel 
reserves completely. Shmem does this, and also the i915 shrinker in 
some situations, but I agree it needs to be verified to be valuable 
and if so, in what situations.

*) Hide ttm_tt_swap[out|in] details in the ttm_pool code. In the pool code
    we have more details about the backing pages and can split pages,
    transition caching state and copy as necessary. Also investigate the
    possibility of reusing pool pages in a smart way if copying is needed.

Well I think we don't need to split pages at all. The higher order 
pages are just allocated for better TLB utilization and could (in 
theory) be freed as individual pages as well. It's just that MM 
doesn't support that atm.

If we can insert pages directly into the swap-cache, splitting might
be needed, at least if compound pages were allocated to begin with.
Looks like shmem does this as well before inserting into the
swap-cache. There could also be a corner case where the system
theoretically supports swapping PMD-size pages, but no PMD-size slots
are available. (My system behaves like that, need to investigate why).
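
The corner case above can be sketched as a small user-space model: a
PMD-size block is swapped as one unit if a huge slot is free, and
otherwise has to be split into individual pages. This is only an
illustration under stated assumptions; toy_swap and toy_swap_block are
hypothetical names, not kernel API, and TOY_PMD_ORDER assumes x86-64 with
4 KiB base pages (2 MiB / 4 KiB = 512 pages).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: x86-64, 4 KiB pages, so a PMD-size block is order 9. */
#define TOY_PMD_ORDER 9

/* Hypothetical toy model of swap space, NOT kernel code. */
struct toy_swap {
	size_t free_huge_slots;  /* free PMD-size slot clusters */
	size_t free_small_slots; /* free single-page slots */
};

/*
 * Store one block of 2^order pages in swap.
 * Returns the number of swap entries used: 1 if the block went out as a
 * single PMD-size unit, 2^order if it had to be split into single pages,
 * or 0 if it did not fit (state is left unchanged in that case).
 */
static size_t toy_swap_block(struct toy_swap *s, unsigned int order)
{
	size_t nr_pages = (size_t)1 << order;

	if (order == TOY_PMD_ORDER && s->free_huge_slots > 0) {
		s->free_huge_slots--;
		return 1;
	}

	/* No huge slot available: fall back to splitting the block. */
	if (s->free_small_slots < nr_pages)
		return 0;

	s->free_small_slots -= nr_pages;
	return nr_pages;
}
```

Starting with one huge slot and 600 small slots, the first order-9 block
uses the huge slot, the second is split into 512 single-page entries, and
a third no longer fits.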

Mhm, sounds like my understanding of the swap-cache is completely 
outdated. Not much of a surprise, it was more than a decade ago that I 
last looked into this.

Christian.

Thanks,

Thomas

But I really like the idea of moving more of this logic into the 
ttm_pool.

*) See if we can directly insert pages into the swap-cache instead of
    taking the shmem detour, something along the lines of the attached
    RFC patch 1.

Yeah, that strongly looks like the way to go. Maybe in combination
with being able to swap out WC/UC pages directly.

When swapping them in again, an extra copy doesn't hurt us, but in
the other direction it really sucks.

Thanks,
Christian.

Thanks,
Thomas