On 2024/8/26 22:46, David Hildenbrand wrote:
On 17.08.24 10:49, Kefeng Wang wrote:
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") doesn't handle hugetlb pages, so the endless
loop still occurs when offlining a hwpoisoned hugetlb page. Luckily,
with commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove
memory section with hwpoisoned hugepage"), the HPageMigratable flag of
the hugetlb page is cleared and the hwpoisoned hugetlb page is
skipped in scan_movable_pages(), so the endless loop issue is fixed.
However, if the HPageMigratable() check passes (without a reference or
lock held), the hugetlb page may still be hwpoisoned. This won't cause
an issue, since the hwpoisoned page will be handled correctly in the
next movable pages scan loop, but it will be isolated in
do_migrate_range() and then fail to migrate. To avoid the unnecessary
isolation and to unify the handling of all hwpoisoned pages, let's
unconditionally check for hwpoison first, and if it is a hwpoisoned
hugetlb page, try to unmap it as the catch-all safety net, as is done
for normal pages.
Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
mm/memory_hotplug.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index dc19b0e28fbc..02a0d4fbc3fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1793,13 +1793,8 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
* but out loop could handle that as it revisits the split
* folio later.
*/
- if (folio_test_large(folio)) {
+ if (folio_test_large(folio))
pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
- if (folio_test_hugetlb(folio)) {
- isolate_hugetlb(folio, &source);
- continue;
- }
- }
/*
* HWPoison pages have elevated reference counts so the migration would
@@ -1808,11 +1803,17 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
* (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
* the unmap as the catch all safety net).
*/
- if (PageHWPoison(page)) {
+ if (folio_test_hwpoison(folio) ||
+ (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
We have the exact same check already in mm/shmem.c now.
Likely this should be factored out ... but no idea what function name we
should use that won't add even more confusion :D
Maybe folio_has_hwpoison()? Miaohe may have some suggestions, but
let's leave it for later.
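For illustration only, the factored-out check discussed above could look
like the sketch below. It is a userspace model, not kernel code:
struct mock_folio and its three fields are stand-ins assumed here for the
real folio flag tests, and folio_has_hwpoison() is only the name floated
in this thread, not an existing kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct folio; the real kernel type differs. */
struct mock_folio {
	bool hwpoison;       /* models folio_test_hwpoison() */
	bool large;          /* models folio_test_large() */
	bool has_hwpoisoned; /* models folio_test_has_hwpoisoned() */
};

/*
 * Hypothetical helper mirroring the condition in the hunk:
 * the folio itself is poisoned, or it is a large folio that
 * contains at least one poisoned subpage.
 */
bool folio_has_hwpoison(const struct mock_folio *folio)
{
	assert(folio != NULL);
	return folio->hwpoison ||
	       (folio->large && folio->has_hwpoisoned);
}
```

Note that the has_hwpoisoned field is only consulted for large folios,
matching the order of the tests in the patch.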
if (WARN_ON(folio_test_lru(folio)))
folio_isolate_lru(folio);
if (folio_mapped(folio))
- try_to_unmap(folio, TTU_IGNORE_MLOCK);
+ unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
+ continue;
+ }
+
+ if (folio_test_hugetlb(folio)) {
+ isolate_hugetlb(folio, &source);
continue;
}
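The ordering that the hunk above establishes can be modelled in
userspace roughly as follows. This is a sketch under assumptions:
enum scan_action and classify_folio() are hypothetical names, and
ACTION_ISOLATE_OTHER stands for the remainder of the loop, which the
quoted hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical outcomes for one folio visited by the migrate loop. */
enum scan_action {
	ACTION_UNMAP_AND_SKIP,  /* hwpoisoned: unmap if mapped, then continue */
	ACTION_ISOLATE_HUGETLB, /* healthy hugetlb: isolate_hugetlb(), continue */
	ACTION_ISOLATE_OTHER,   /* anything else: rest of the loop (not shown) */
};

/*
 * Model of the check order after the patch: the hwpoison test comes
 * first, so a poisoned hugetlb folio is never isolated.
 */
enum scan_action classify_folio(bool hwpoisoned, bool hugetlb)
{
	if (hwpoisoned)
		return ACTION_UNMAP_AND_SKIP;
	if (hugetlb)
		return ACTION_ISOLATE_HUGETLB;
	return ACTION_ISOLATE_OTHER;
}
```

Before the patch the hugetlb test ran first, so the case
(hwpoisoned, hugetlb) would have ended in isolation rather than in the
unmap-and-skip path.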
Acked-by: David Hildenbrand <david@xxxxxxxxxx>
Thanks.