Re: [PATCH] mm/huge_memory: fix swap entry values of tail pages of THP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 14.02.24 15:34, David Hildenbrand wrote:
On 14.02.24 15:18, Matthew Wilcox wrote:
On Wed, Feb 14, 2024 at 12:04:10PM +0530, Charan Teja Kalla wrote:
1) Is it broken in 5.15? Did you actually try to reproduce or is this
     just a guess?


We didn't run the tests with THP enabled on 5.15, __so we didn't
encounter this issue__ on older to 6.1 kernels.

I mentioned that issue exists is based on my understanding after code
walk through. To be specific, I just looked to the
migrate_pages()->..->migrate_page_move_mapping() &
__split_huge_page_tail() where the ->private field of thp sub-pages is
not filled with swap entry. If it could have set, I think these are the
only places where it would have done, per my understanding. CMIW.

I think you have a misunderstanding.  David's patch cfeed8ffe55b (part
of 6.6) _stopped_ us using the tail ->private entries.  So in 6.1, these
tail pages should already have page->private set, and I don't understand
what you're fixing.

I think the issue is, that migrate_page_move_mapping() /
folio_migrate_mapping() would update ->private for a folio in the
swapcache (head page)

	newfolio->private = folio_get_private(folio);

but not the ->private of the tail pages.

So once you migrate a THP that is in the swapcache, ->private of the
tail pages would not be migrated and, therefore, be stale/wrong.

Even before your patch that was the case.

Looking at migrate_page_move_mapping(), we had:

	if (PageSwapBacked(page)) {
		__SetPageSwapBacked(newpage);
		if (PageSwapCache(page)) {
			SetPageSwapCache(newpage);
			set_page_private(newpage, page_private(page));
		}
	} else {
		VM_BUG_ON_PAGE(PageSwapCache(page), page);
	}


I don't immediately see where the tail pages would similarly get updated
(via set_page_private).

With my patch the problem is gone, because the tail page entries don't
have to be migrated, because they are unused.


Maybe this was an oversight from THP_SWAP -- 38d8b4e6bdc8 ("mm, THP,
swap: delay splitting THP during swap out").

It did update __add_to_swap_cache():

for (i = 0; i < nr; i++) {
          set_page_private(page + i, entry.val + i);
          error = radix_tree_insert(&address_space->page_tree,
                                    idx + i, page + i);
          if (unlikely(error))
                  break;
}

and similarly __delete_from_swap_cache().

But I don't see any updates to migration code.

Now, it could be that THP migration was added later (post 2017), in that
case the introducing commit would not have been 38d8b4e6bdc8.


Let's continue:

The introducing commit is likely either

(1) 38d8b4e6bdc87 ("mm, THP, swap: delay splitting THP during swap out")

That one added THP_SWAP, but THP migration wasn't supported yet AFAIKS.

-> v4.13

(2) 616b8371539a6 ("mm: thp: enable thp migration in generic path")

Or likely any of the following that actually allocate THP for migration:

8135d8926c08e mm: memory_hotplug: memory hotremove supports thp migration
e8db67eb0ded3 mm: migrate: move_pages() supports thp migration
c8633798497ce mm: mempolicy: mbind and migrate_pages support thp migration

That actually enable THP migration.

-> v4.14


So likely we'd have to fix the stable kernels:

4.19
5.4
5.10
5.15
6.1

That's a lot of pre-folio code. A backport of my series likely won't really make any sense.

Staring at 4.19.307 code base, we likely have to perform a stable-only fix that properly handles the swapcache of compoud pages in migrate_page_move_mapping().

Ugly.

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux