Am 03.12.24 um 17:43 schrieb Thomas Hellström:
On Tue, 2024-12-03 at 17:39 +0100, Christian König wrote:Am 03.12.24 um 17:31 schrieb Thomas Hellström:On Tue, 2024-12-03 at 17:20 +0100, Christian König wrote:[SNIP]@@ -453,9 +601,36 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, else gfp_flags |= GFP_HIGHUSER;- for (order = min_t(unsigned int,MAX_PAGE_ORDER, __fls(num_pages)); - num_pages; - order = min_t(unsigned int, order, __fls(num_pages))) { + order = min_t(unsigned int, MAX_PAGE_ORDER, __fls(num_pages)); + + if (tt->page_flags & TTM_TT_FLAG_PRIV_BACKED_UP) { + if (!tt->restore) { + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN; + + if (ctx->gfp_retry_mayfail) + gfp |= __GFP_RETRY_MAYFAIL; + + tt->restore = + kvzalloc(struct_size(t t-restore,old_pages, + (size_t)1 << order), gfp); + if (!tt->restore) + return -ENOMEM; + } else if (ttm_pool_restore_valid(tt-restore)) {+ struct ttm_pool_tt_restore *restore = tt-restore;+ + num_pages -= restore-alloced_pages;+ order = min_t(unsigned int, order, __fls(num_pages)); + pages += restore-alloced_pages;+ r = ttm_pool_restore_tt(restore, tt-backup, ctx);+ if (r) + return r; + caching = restore-caching_divide;+ } + + tt->restore->pool = pool; + }Hui? Why is that part of the allocation function now? At bare minimum I would expect that this is a new function.It's because we now have partially backed up tts, so the restore is interleaved on a per-page basis, replacing the backup handles with page-pointers. I'll see if I can separate out at least the initialization here.Yeah, that kind of makes sense. My expectation was just that we now have explicit ttm_pool_swapout() and ttm_pool_swapin() functions.I fully understand, although in the allocation step, that would also increase the memory pressure since we might momentarily have twice the bo-size allocated, if the shmem object was never swapped out, and we don't want to unnecessarily risc OOM at recover time, although that should be a recoverable situation now. If the OOM receiver can free up system memory resources they can could potentially restart the recover.What I meant was more that we have ttm_pool_swapout() which does a mix of moving each page to a swap backend and freeing one by one. And ttm_pool_swapin() which allocates a bit of memory (usually one huge page) and then copies the content back in from the swap backend. Alternatively we could rename ttm_pool_alloc() into something like ttm_pool_populate() and ttm_pool_free() into ttm_pool_unpopulate(), but those names are not very descriptive either. It's just that we now do a bit more than just alloc and free in those functions, so the naming doesn't really match that well any more.So what about ttm_pool_alloc() and ttm_pool_recover/swapin(), both pointing to the same code, but _alloc() asserts that the tt isn't backed up? That would give a clean interface at least.More or less ok. I would just put figuring out the gfp flags and the stuff inside the for (order... loop into separate functions. And then remove the if (tt->page_flags & TTM_TT_FLAG_PRIV_BACKED_UP) from the pool. In other words you trigger the back restore by calling a different function than the allocation one.I'll take a look at this as well.
Ah, and BTW: It's perfectly possible that ttm_tt_free() is called because a halve swapped TT is about to be destroyed!
If I'm not completely mistaken that is not handled gracefully when we try to always backup from in the ttm_tt_free() function.
So we clearly need the separation of move this TT to a backup (and eventually only partially) and freeing it.
Christian.
/ThomasFor a renaming change that touch all TTM drivers, I'd rather put that as a last patch since getting acks for that from all TTM driver maintainers seems like a hopeless undertaking.Yeah the acks are not the problem, merging it through the xe tree would be. Christian./ThomasChristian./Thomas