Re: [PATCH 2/4] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()


On 2024/8/2 4:14, David Hildenbrand wrote:
On 25.07.24 03:16, Kefeng Wang wrote:
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") doesn't handle hugetlb pages, so the dead loop
still occurs when offlining a hwpoisoned hugetlb page. Luckily, since commit
e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
section with hwpoisoned hugepage"), the HPageMigratable flag of the hugetlb
page is cleared and the hwpoisoned hugetlb page is skipped in
scan_movable_pages(), so the dead loop issue is fixed.

However, if the HPageMigratable() check passes (without a reference or
lock held), the hugetlb page may still be hwpoisoned. This won't cause an
issue, since the hwpoisoned page will be handled correctly in the next
movable pages scan loop; it will just be isolated in do_migrate_range()
and then fail to migrate. To avoid the unnecessary isolation and
unify the handling of all hwpoisoned pages, unconditionally check hwpoison
first, and if it is a hwpoisoned hugetlb page, try to unmap it as
the catch-all safety net, as is done for normal pages.

Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
  mm/memory_hotplug.c | 27 ++++++++++++++++-----------
  1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 66267c26ca1b..ccaf4c480aed 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
          folio = page_folio(page);
          head = &folio->page;
-        if (PageHuge(page)) {
-            pfn = page_to_pfn(head) + compound_nr(head) - 1;
-            isolate_hugetlb(folio, &source);
-            continue;
-        } else if (PageTransHuge(page))
-            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
-
          /*
           * HWPoison pages have elevated reference counts so the migration would
           * fail on them. It also doesn't make any sense to migrate them in the
           * first place. Still try to unmap such a page in case it is still mapped
-         * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
-         * the unmap as the catch all safety net).
+         * (keep the unmap as the catch all safety net).
           */
-        if (PageHWPoison(page)) {
+        if (unlikely(PageHWPoison(page))) {
+            folio = page_folio(page);
+
              if (WARN_ON(folio_test_lru(folio)))
                  folio_isolate_lru(folio);
+
              if (folio_mapped(folio))
-                try_to_unmap(folio, TTU_IGNORE_MLOCK);
+                unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
+
+            if (folio_test_large(folio))
+                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
              continue;
          }
+        if (PageHuge(page)) {
+            pfn = page_to_pfn(head) + compound_nr(head) - 1;
+            isolate_hugetlb(folio, &source);
+            continue;
+        } else if (PageTransHuge(page))
+            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;

If we can use a folio in the PageHWPoison() case, can we use one here as well? I know that it's all unreliable when not holding a folio reference, and we have to be a bit careful.

Using a folio here is part of patch 4. I want to unify the hugetlb/THP (or large folio) cases with "pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1" for large folios, after getting a reference.


It feels like using folios here would mostly be fine, because things like PageHuge() already use folios internally.

And using it in the PageHWPoison() but not here looks a bit odd.

We will convert this to use a folio in a following patch.


The important part is that we don't segfault if we'd overshoot our target.
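
For illustration only, the pfn skip discussed above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not kernel code: struct fake_folio, folio_last_pfn(), lookup() and count_visits() are hypothetical stand-ins, and only the arithmetic "first pfn + number of pages - 1" is taken from the thread. It shows that the loop's own pfn++ lands on the first pfn after a large folio, and that a skip past end_pfn simply ends the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: its first pfn and its size in pages. */
struct fake_folio {
    unsigned long first_pfn;
    unsigned long nr_pages;
};

/* Last pfn covered by the folio: first pfn + number of pages - 1. */
static unsigned long folio_last_pfn(const struct fake_folio *f)
{
    assert(f->nr_pages >= 1);
    return f->first_pfn + f->nr_pages - 1;
}

/* Return the folio covering pfn, or NULL if no folio in the table covers it. */
static const struct fake_folio *lookup(const struct fake_folio *folios,
                                       size_t n, unsigned long pfn)
{
    for (size_t i = 0; i < n; i++)
        if (pfn >= folios[i].first_pfn && pfn <= folio_last_pfn(&folios[i]))
            return &folios[i];
    return NULL;
}

/* Walk [start_pfn, end_pfn) and count how many loop iterations run. */
static unsigned long count_visits(const struct fake_folio *folios, size_t n,
                                  unsigned long start_pfn,
                                  unsigned long end_pfn)
{
    unsigned long visits = 0;

    for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++) {
        const struct fake_folio *f = lookup(folios, n, pfn);

        visits++;
        /*
         * For a large folio, jump to its last pfn; the loop increment
         * then moves to the first pfn after it. If that is >= end_pfn,
         * the loop condition fails and the walk just stops.
         */
        if (f && f->nr_pages > 1)
            pfn = folio_last_pfn(f);
    }
    return visits;
}
```

With folios covering pfns 0, 4..7 and 8..519, a walk over [0, 16) runs one iteration per small page or hole and one per large folio, and the last skip (to pfn 519) overshoots end_pfn without touching anything beyond the range.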




