On Tue, Sep 06, 2022 at 04:14:40PM +0800, Miaohe Lin wrote:
> On 2022/9/6 14:14, HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Tue, Sep 06, 2022 at 10:59:58AM +0800, Miaohe Lin wrote:
> >> On 2022/9/5 14:21, Naoya Horiguchi wrote:
> >>> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> >>>
> >>> HWPoisoned page is not supposed to be accessed once marked, but currently
> >>> such accesses can happen during memory hotremove because do_migrate_range()
> >>> can be called before dissolve_free_huge_pages() is called.
> >>>
> >>> Move dissolve_free_huge_pages() before scan_movable_pages(). Recently
> >>> delayed dissolve has been implemented, so the dissolving can turn
> >>> a hwpoisoned hugepage into 4kB hwpoison page, which memory hotplug can
> >>> handle safely.
> >>
> >> Yes, thanks for your work, Naoya. ;)
> >>
> >>>
> >>> Reported-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> >>> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> >>> ---
> >>>  mm/memory_hotplug.c | 22 +++++++++++-----------
> >>>  1 file changed, 11 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >>> index fad6d1f2262a..c24735d63b25 100644
> >>> --- a/mm/memory_hotplug.c
> >>> +++ b/mm/memory_hotplug.c
> >>> @@ -1880,6 +1880,17 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
> >>>
> >>>  			cond_resched();
> >>>
> >>> +			/*
> >>> +			 * Dissolve free hugepages in the memory block before doing
> >>> +			 * offlining actually in order to make hugetlbfs's object
> >>> +			 * counting consistent.
> >>> +			 */
> >>> +			ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> >>> +			if (ret) {
> >>> +				reason = "failure to dissolve huge pages";
> >>> +				goto failed_removal_isolated;
> >>> +			}
> >>
> >> This change has a side-effect. If hugetlb pages are in use, dissolve_free_huge_pages()
> >> will always return -EBUSY even if those pages can be migrated. So we fail to hotremove
> >> the memory even though it could be offlined. Or am I missing something?
> >
> > Thank you for the comment, you're right. (Taking a look over my test result
> > carefully, it showed failures for the related cases, I somehow overlooked
> > them, really sorry.)  So my second thought is that we keep offline_pages()
> > as is, and insert a few lines in do_migrate_range() to handle the case of
> > hwpoisoned hugepages like below:
> >
> > @@ -1642,6 +1642,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> >
> >  		if (PageHuge(page)) {
> >  			pfn = page_to_pfn(head) + compound_nr(head) - 1;
> > +			if (PageHWPoison(head))
> > +				continue;
>
> Thanks for your update. But it seems this is not enough. With the above code change,
> HWPoisoned hugetlb pages will always be ignored in do_migrate_range(). And if these
> pages are HPageMigratable, they will be returned by scan_movable_pages() and then
> passed into do_migrate_range() again. Thus an endless loop may occur until these
> pages become un-movable?

Yeah, so scan_movable_pages() could have an additional check for hwpoisoned
hugepages, or making a hwpoisoned hugepage !HPageMigratable (somehow) might be
another option. I like the latter for now, but I need to look into how I can
update the patch further.
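
For the former option, a rough sketch (untested, just to show the idea; it
assumes the current hugetlb branch in scan_movable_pages() and reuses its
local variables page, head, pfn and skip) would be to never report a
hwpoisoned hugepage as movable:

		if (!PageHuge(page))
			continue;
		head = compound_head(page);
		/*
		 * Skip the whole hugepage if it is hwpoisoned, so that
		 * scan_movable_pages() never reports it as movable and
		 * do_migrate_range() is not fed the same page forever.
		 */
		if (PageHWPoison(head)) {
			pfn += compound_nr(head) - (page - head) - 1;
			continue;
		}
		if (HPageMigratable(head))
			goto found;
		skip = compound_nr(head) - (page - head);
		pfn += skip - 1;

The new check is racy in the same way as the HPageMigratable() test is, and
clearing HPageMigratable from the memory_failure() path would be the
alternative that avoids touching this loop at all.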
> >  			isolate_hugetlb(head, &source);
> >  			continue;
> >  		} else if (PageTransHuge(page))
> >
> > This is slightly different from your original suggestion
> > https://lore.kernel.org/linux-mm/20220421135129.19767-1-linmiaohe@xxxxxxxxxx/T
> > , because, as discussed in that thread, the existing "if (PageHWPoison(page))"
> > branch in this function can't be used for hugetlb. We could adjust it to handle
> > hugetlb, but maybe separating the hugetlb code from the others first looks less
> > complicated to me.
>
> It might be better to do something more, e.g. unmap the hugetlb pages to prevent
> processes from accessing them if they are mapped, or even try truncating the error
> page from the pagecache. But I have no strong opinion, as handling memory failure
> is always best-effort. ;)

This could be helpful, I'll try something along these lines (a rough, untested
sketch of what I have in mind is appended at the end of this mail).
Thank you for the valuable comments.

- Naoya Horiguchi

>
> Thanks,
> Miaohe Lin
>
> >
> > If you have any suggestion on this, please let me know.
> >
> > Thanks,
> > Naoya Horiguchi
> >
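
Appended sketch of the unmapping idea (untested, on top of the do_migrate_range()
hunk quoted above, modulo the exact page/folio interfaces in the current tree;
the locking needed for shared hugetlb mappings, cf. hwpoison_user_mappings(),
still has to be sorted out):

		if (PageHuge(page)) {
			pfn = page_to_pfn(head) + compound_nr(head) - 1;
			if (PageHWPoison(head)) {
				/*
				 * Best effort: unmap the hwpoisoned hugepage
				 * so that no process keeps accessing it while
				 * the memory block goes offline, then skip it
				 * instead of trying to migrate it.
				 */
				if (page_mapped(head))
					try_to_unmap(page_folio(head),
						     TTU_IGNORE_MLOCK);
				continue;
			}
			isolate_hugetlb(head, &source);
			continue;
		} else if (PageTransHuge(page))

Truncating the error page from the pagecache, as you suggested, would be a
further step on top of this and is not shown here.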