On 2022/9/21 17:13, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>
> HWPoisoned page is not supposed to be accessed once marked, but currently
> such accesses can happen during memory hotremove because do_migrate_range()
> can be called before dissolve_free_huge_pages() is called.
>
> Clear HPageMigratable for hwpoisoned hugepages to prevent them from being
> migrated.  This should be done in hugetlb_lock to avoid race against
> isolate_hugetlb().
>
> get_hwpoison_huge_page() needs to have a flag to show it's called from
> unpoison to take refcount of hwpoisoned hugepages, so add it.
>
> Reported-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>

Thanks for your work, Naoya. Maybe something to improve below.

> ---
> ChangeLog v2 -> v3
> - move to the approach of clearing HPageMigratable instead of shifting
>   dissolve_free_huge_pages.
> ---
>  include/linux/hugetlb.h |  4 ++--
>  mm/hugetlb.c            |  4 ++--
>  mm/memory-failure.c     | 12 ++++++++++--
>  3 files changed, 14 insertions(+), 6 deletions(-)
>

<snip>

> @@ -7267,7 +7267,7 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
>  		*hugetlb = true;
>  		if (HPageFreed(page))
>  			ret = 0;
> -		else if (HPageMigratable(page))
> +		else if (HPageMigratable(page) || unpoison)

Is unpoison_memory() expected to restore the HPageMigratable flag as well?
>  			ret = get_page_unless_zero(page);
>  		else
>  			ret = -EBUSY;
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 145bb561ddb3..5942e1c0407e 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1244,7 +1244,7 @@ static int __get_hwpoison_page(struct page *page, unsigned long flags)
>  	int ret = 0;
>  	bool hugetlb = false;
>
> -	ret = get_hwpoison_huge_page(head, &hugetlb);
> +	ret = get_hwpoison_huge_page(head, &hugetlb, false);
>  	if (hugetlb)
>  		return ret;
>
> @@ -1334,7 +1334,7 @@ static int __get_unpoison_page(struct page *page)
>  	int ret = 0;
>  	bool hugetlb = false;
>
> -	ret = get_hwpoison_huge_page(head, &hugetlb);
> +	ret = get_hwpoison_huge_page(head, &hugetlb, true);
>  	if (hugetlb)
>  		return ret;
>
> @@ -1815,6 +1815,13 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
>  		goto out;
>  	}
>
> +	/*
> +	 * Clearing HPageMigratable for hwpoisoned hugepages to prevent them
> +	 * from being migrated by memory hotremove.
> +	 */
> +	if (count_increased)
> +		ClearHPageMigratable(head);

I believe this can prevent hwpoisoned hugepages from being migrated, though
there may still be some windows.

> +
>  	return ret;
>  out:
>  	if (count_increased)
> @@ -1862,6 +1869,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>
>  	if (hwpoison_filter(p)) {
>  		hugetlb_clear_page_hwpoison(head);
> +		SetHPageMigratable(head);

Would this set the HPageMigratable flag for free hugetlb pages here? IIUC,
they're not expected to have this flag set.

Thanks,
Miaohe Lin

>  		unlock_page(head);
>  		if (res == 1)
>  			put_page(head);
>
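
To make the last question concrete, here is a small user-space sketch of the
flag transitions being discussed. It is a toy model only, not kernel code:
struct toy_hpage and the toy_* helpers are hypothetical names invented for
illustration, the hugetlb_lock locking is omitted, and the "guarded" variant
is just one possible reading of the concern (restore the flag only when a
reference was taken, i.e. res == 1), not a fix proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hugetlb head page; NOT the kernel's struct page. */
struct toy_hpage {
	bool migratable;	/* models HPageMigratable */
	bool hwpoison;		/* models the hwpoison mark */
	int refcount;		/* 0 models a free hugepage */
};

/*
 * Models the quoted hunk in __get_huge_page_for_hwpoison(): the flag is
 * cleared only when a reference was taken (count_increased).
 * Returns 1 for an in-use page, 0 for a free page.
 */
static int toy_get_for_hwpoison(struct toy_hpage *p)
{
	bool count_increased = false;

	if (p->refcount > 0) {
		p->refcount++;
		count_increased = true;
	}
	p->hwpoison = true;
	if (count_increased)
		p->migratable = false;

	return count_increased ? 1 : 0;
}

/*
 * Models the hwpoison_filter() undo path in try_memory_failure_hugetlb().
 * guarded == false: set the flag unconditionally, as in the quoted hunk.
 * guarded == true:  assumed variant that sets it only when res == 1.
 */
static void toy_filter_undo(struct toy_hpage *p, int res, bool guarded)
{
	p->hwpoison = false;
	if (!guarded || res == 1)
		p->migratable = true;
	if (res == 1)
		p->refcount--;
}
```

In this model, a free page (refcount 0, flag clear) that goes through
toy_get_for_hwpoison() and then the unguarded toy_filter_undo() ends up with
the flag set, which is the state the question above is asking about. With the
guarded variant the free page keeps the flag clear, while an in-use page gets
it back.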