On 2024/8/22 14:52, Miaohe Lin wrote:
On 2024/8/17 16:49, Kefeng Wang wrote:
The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") doesn't handle hugetlb pages, so the endless
loop still occurs when offlining a hwpoisoned hugetlb page. Luckily, with the
commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove
memory section with hwpoisoned hugepage") section with hwpoisoned
hugepage"), the HPageMigratable of hugetlb page will be clear, and
It should be commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section
with hwpoisoned hugepage")? Above "section with hwpoisoned")" is duplicated.
Also s/be clear/be cleared/ ?
Acked, thanks for the careful review.
the hwpoison hugetlb page will be skipped in scan_movable_pages(),
so the endless loop issue is fixed.
However, if the HPageMigratable() check passes (it is done without a
reference or the lock), the hugetlb page may still be hwpoisoned. This does
not cause a problem, since the hwpoisoned page will be handled correctly in
the next movable pages scan loop; it will be isolated in do_migrate_range()
but then fail to migrate. To avoid the unnecessary isolation and to
unify the handling of all hwpoisoned pages, let's unconditionally check for
hwpoison first, and if it is a hwpoisoned hugetlb page, try to unmap it as
the catch-all safety net, as is done for normal pages.
Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
mm/memory_hotplug.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index dc19b0e28fbc..02a0d4fbc3fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1793,13 +1793,8 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		 * but out loop could handle that as it revisits the split
 		 * folio later.
 		 */
-		if (folio_test_large(folio)) {
+		if (folio_test_large(folio))
 			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
-			if (folio_test_hugetlb(folio)) {
-				isolate_hugetlb(folio, &source);
-				continue;
-			}
-		}

 		/*
 		 * HWPoison pages have elevated reference counts so the migration would
@@ -1808,11 +1803,17 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		 * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
 		 * the unmap as the catch all safety net).
 		 */
-		if (PageHWPoison(page)) {
+		if (folio_test_hwpoison(folio) ||
+		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
 			if (folio_mapped(folio))
-				try_to_unmap(folio, TTU_IGNORE_MLOCK);
+				unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
+			continue;
+		}
+
+		if (folio_test_hugetlb(folio)) {
+			isolate_hugetlb(folio, &source);
While you're here, should we pr_warn "failed to isolate pfn xx" for hugetlb folios too, as
we already do for raw pages and THP folios?
We will unify folio isolation in the final patch, which will print a warning for
a hugetlb folio when it fails to isolate, so there is no need to add it here.
Thanks.
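
For readers following the thread, the reordering discussed above can be
sketched as a small userspace model. This is only an illustration of the
decision order after the patch, not kernel code: struct toy_folio,
enum toy_action and toy_classify() are hypothetical names, the boolean
fields merely stand in for the folio_test_*() helpers seen in the diff, and
the WARN_ON()/folio_isolate_lru() step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the folio state the patch inspects. */
struct toy_folio {
	bool hwpoison;       /* models folio_test_hwpoison() */
	bool large;          /* models folio_test_large() */
	bool has_hwpoisoned; /* models folio_test_has_hwpoisoned() */
	bool hugetlb;        /* models folio_test_hugetlb() */
	bool mapped;         /* models folio_mapped() */
};

/* What the loop body would do with one folio (illustrative only). */
enum toy_action {
	TOY_SKIP_POISONED,   /* poisoned and unmapped: skip it */
	TOY_UNMAP_POISONED,  /* poisoned and mapped: unmap, then skip */
	TOY_ISOLATE_HUGETLB, /* healthy hugetlb: isolate for migration */
	TOY_ISOLATE_OTHER,   /* anything else: normal isolation path */
};

static enum toy_action toy_classify(const struct toy_folio *f)
{
	assert(f != NULL);

	/* The hwpoison check now comes first, for every folio type. */
	if (f->hwpoison || (f->large && f->has_hwpoisoned))
		return f->mapped ? TOY_UNMAP_POISONED : TOY_SKIP_POISONED;

	/* Only a folio that passed the hwpoison check reaches isolation. */
	if (f->hugetlb)
		return TOY_ISOLATE_HUGETLB;

	return TOY_ISOLATE_OTHER;
}
```

In this model a hwpoisoned hugetlb folio never reaches the isolation branch,
which is the "avoid the unnecessary isolation" point from the commit message.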