Re: [PATCH 2/4] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()

On 2024/7/30 18:26, David Hildenbrand wrote:
On 25.07.24 03:16, Kefeng Wang wrote:
The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") doesn't handle hugetlb pages, so the dead loop
still occurs when offlining a hwpoisoned hugetlb page. Luckily, after
commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove
memory section with hwpoisoned hugepage"), the HPageMigratable flag of
a hwpoisoned hugetlb page is cleared, so the page is skipped in
scan_movable_pages() and the dead loop issue is fixed.
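
[For reference, the hugetlb check in scan_movable_pages() that now
skips such pages looks roughly like this -- paraphrased from mainline,
not part of this patch; helper names may differ between kernel
versions:]

	if (!PageHuge(page))
		continue;
	folio = page_folio(page);
	/*
	 * Racy check: no reference or lock is held, so the folio could
	 * be dismantled or poisoned concurrently; callers must cope
	 * with false positives and negatives.
	 */
	if (folio_test_hugetlb_migratable(folio))
		goto found;
	/* non-migratable (e.g. hwpoisoned) hugetlb: skip the whole folio */
	pfn |= folio_nr_pages(folio) - 1;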

did you mean "endless loop"?

Exactly, will fix the words.



However, if the HPageMigratable() check passes (it is done without a
reference or lock), the hugetlb page may still be hwpoisoned. This
causes no issue, since the hwpoisoned page will be handled correctly
in the next movable-pages scan loop: it will be isolated in
do_migrate_range() but then fail to migrate. To avoid the unnecessary
isolation and to unify all hwpoisoned page handling, unconditionally
check for hwpoison first, and if it is a hwpoisoned hugetlb page, try
to unmap it as the catch-all safety net, just as for a normal page.
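
[The net effect on do_migrate_range() is roughly the following
ordering -- a sketch assembled from the diff below and the later
mention of unmap_poisoned_folio(), not the literal patch body; the
TTU flag is an assumption:]

	if (unlikely(PageHWPoison(page))) {
		/*
		 * Never isolate a hwpoisoned page (hugetlb or not);
		 * only try the catch-all unmap in case it is still
		 * mapped.
		 */
		if (folio_mapped(folio))
			unmap_poisoned_folio(folio, TTU_IGNORE_MLOCK);
		continue;
	}

	if (PageHuge(page)) {
		/* only reached for non-poisoned hugetlb now */
		pfn = page_to_pfn(head) + compound_nr(head) - 1;
		isolate_hugetlb(folio, &source);
		continue;
	} else if (PageTransHuge(page))
		pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;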

But what's the benefit here besides slightly faster handling in an absolute corner case (I strongly suspect that we don't care)?

Yes, it is a very corner case. The goal is to move isolate_hugetlb()
after the HWPoison check and then unify the isolation and folio
conversion (patch 4). But we must correctly handle the hugetlb unmap
when we meet a hwpoisoned page.



Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
  mm/memory_hotplug.c | 27 ++++++++++++++++-----------
  1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 66267c26ca1b..ccaf4c480aed 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
          folio = page_folio(page);
          head = &folio->page;
-        if (PageHuge(page)) {
-            pfn = page_to_pfn(head) + compound_nr(head) - 1;
-            isolate_hugetlb(folio, &source);
-            continue;
-        } else if (PageTransHuge(page))
-            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
-
          /*
           * HWPoison pages have elevated reference counts so the migration would
           * fail on them. It also doesn't make any sense to migrate them in the
           * first place. Still try to unmap such a page in case it is still mapped
-         * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
-         * the unmap as the catch all safety net).
+         * (keep the unmap as the catch all safety net).
           */
-        if (PageHWPoison(page)) {
+        if (unlikely(PageHWPoison(page))) {

We're not checking the head page here; will this work reliably for
hugetlb? (I recall some difference in per-page hwpoison handling
between hugetlb and THP due to the vmemmap optimization.)

Before this change, we don't try to unmap a hwpoisoned hugetlb page in
do_migrate_range(); we expect it was already unmapped in
memory_failure(). As the comment mentions, that unmap may fail, so
this adds a new safeguard to try to unmap it again here, but we don't
need to guarantee it.

unmap_poisoned_folio() is used to correctly handle hugetlb pages in
shared mappings when we meet a hwpoisoned page (which may be the head
page or a subpage).
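
[For hugetlb in a shared (file-backed) mapping, the unmap must hold
the i_mmap rwsem in write mode because of huge PMD sharing. A sketch
of that shape, modeled on how memory-failure.c handles it -- helper
names approximate, treat this as an assumption rather than the exact
patch body:]

	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
		/*
		 * Shared-mapping hugetlb: unmapping may need to tear
		 * down shared PMDs, which requires i_mmap_rwsem held
		 * for write.
		 */
		struct address_space *mapping;

		mapping = hugetlb_folio_mapping_lock_write(folio);
		if (mapping) {
			try_to_unmap(folio, ttu | TTU_RMAP_LOCKED);
			i_mmap_unlock_write(mapping);
		}
		/* else: the folio was truncated/freed under us; give up */
	} else {
		try_to_unmap(folio, ttu);
	}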
