Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

On 23.01.23 21:37, Matthew Wilcox wrote:
> On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote:
>> @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>   			continue;
>>   		page = pfn_to_page(pfn);
>>   		folio = page_folio(page);
>> -		head = &folio->page;
>> -		if (PageHuge(page)) {
>> -			pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> +		if (folio_test_hugetlb(folio)) {
>> +			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>   			isolate_hugetlb(folio, &source);
>>   			continue;
>> -		} else if (PageTransHuge(page))
>> -			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> +		} else if (folio_test_transhuge(folio))
>> +			pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;
>
> I'm pretty sure those two lines should be...
>
> 		} else if (folio_test_large(folio))
> 			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>
> But, erm ... we're doing this before we have a refcount on the page,
> right?  So this is unsafe because the page might change which folio
> it is in.  And the folio we found earlier might become a tail page
> of a different folio.  (As the comment below explains, HWPoison pages
> won't, so it's not unsafe for them).
>
> Also, thp_nr_pages(page) is going to return 1 for tail pages.  So this
> is a noop, unless page is a head page.
>
> It's all a bit confusing, and being memory-hotplug, it's not well
> tested.  More thought needed.

Ehm, it is fairly well tested ;)

As memory offlining keeps retrying, temporarily making wrong assumptions about a folio is acceptable, as long as we don't run into BUGs.

It's certainly worth a big comment in the code explaining that this is all racy and that the page migration code will stabilize the page.

Now, we could temporarily take a reference, but ... common migration code will try taking its own ref to stabilize the page and would be confused about yet another ref (-> migration will fail).

So we have to be careful about grabbing references on these pages, and how long we're going to hold them. Otherwise we'll break memory offlining completely :)
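
For reference, the skip arithmetic suggested above (folio_pfn(folio) + folio_nr_pages(folio) - 1) can be modelled in userspace roughly as below. This is only a sketch: struct model_folio and the model_*() helpers are hypothetical stand-ins, not kernel types or APIs, and it deliberately ignores the race discussed above (the values are assumed stable).

```c
/*
 * Userspace model of the pfn skip in the scan loop.
 * Hypothetical stand-in for struct folio; NOT the kernel type.
 */
struct model_folio {
	unsigned long first_pfn;	/* models what folio_pfn() returns */
	unsigned long nr_pages;		/* models what folio_nr_pages() returns */
};

/* Last pfn covered by the folio: first_pfn + nr_pages - 1. */
unsigned long model_last_pfn(const struct model_folio *f)
{
	return f->first_pfn + f->nr_pages - 1;
}

/*
 * The loop sets pfn to the folio's last pfn and then does pfn++,
 * so the next pfn it looks at is the first one past the folio.
 */
unsigned long model_next_pfn(const struct model_folio *f)
{
	return model_last_pfn(f) + 1;
}
```

With a 512-page folio starting at pfn 512, the loop would jump to pfn 1023 and continue at 1024; for a single-page folio the assignment changes nothing and the loop simply advances by one.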

--
Thanks,

David / dhildenb




