From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>

When hugetlb page fault (under overcommitting situation) and
memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following
race:

    CPU0:                           CPU1:

                                    gather_surplus_pages()
                                      page = alloc_surplus_huge_page()
    memory_failure_hugetlb()
      get_hwpoison_page(page)
        __get_hwpoison_page(page)
          get_page_unless_zero(page)
                                      zero = put_page_testzero(page)
                                      VM_BUG_ON_PAGE(!zero, page)
                                      enqueue_huge_page(h, page)
      put_page(page)

__get_hwpoison_page() only checks the page refcount before taking an
additional one for memory error handling, which is wrong because there
are time windows where compound pages have a non-zero refcount during
initialization.

So make __get_hwpoison_page() check the page status a bit more for a
few types of compound pages. A PageSlab() check is added because
otherwise the "non anonymous thp" path is wrongly chosen.

Fixes: ead07f6a867b ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
Reported-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 5.12+
---
changelog v3:
- recheck PageHuge after holding hugetlb_lock,
---
 mm/memory-failure.c | 55 ++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git v5.12/mm/memory-failure.c v5.12_patched/mm/memory-failure.c
index a3659619d293..02668b24e512 100644
--- v5.12/mm/memory-failure.c
+++ v5.12_patched/mm/memory-failure.c
@@ -1095,30 +1095,43 @@ static int __get_hwpoison_page(struct page *page)
 {
 	struct page *head = compound_head(page);
 
-	if (!PageHuge(head) && PageTransHuge(head)) {
-		/*
-		 * Non anonymous thp exists only in allocation/free time. We
-		 * can't handle such a case correctly, so let's give it up.
-		 * This should be better than triggering BUG_ON when kernel
-		 * tries to touch the "partially handled" page.
-		 */
-		if (!PageAnon(head)) {
-			pr_err("Memory failure: %#lx: non anonymous thp\n",
-				page_to_pfn(page));
-			return 0;
+	if (PageCompound(page)) {
+		if (PageSlab(page)) {
+			return get_page_unless_zero(page);
+		} else if (PageHuge(head)) {
+			int ret = 0;
+
+			spin_lock(&hugetlb_lock);
+			if (!PageHuge(head))
+				ret = -EBUSY;
+			else if (HPageFreed(head) || HPageMigratable(head))
+				ret = get_page_unless_zero(head);
+			spin_unlock(&hugetlb_lock);
+			return ret;
+		} else if (PageTransHuge(head)) {
+			/*
+			 * Non anonymous thp exists only in allocation/free time. We
+			 * can't handle such a case correctly, so let's give it up.
+			 * This should be better than triggering BUG_ON when kernel
+			 * tries to touch the "partially handled" page.
+			 */
+			if (!PageAnon(head)) {
+				pr_err("Memory failure: %#lx: non anonymous thp\n",
+					page_to_pfn(page));
+				return 0;
+			}
+			if (get_page_unless_zero(head)) {
+				if (head == compound_head(page))
+					return 1;
+				pr_info("Memory failure: %#lx cannot catch tail\n",
+					page_to_pfn(page));
+				put_page(head);
+			}
 		}
+		return 0;
 	}
 
-	if (get_page_unless_zero(head)) {
-		if (head == compound_head(page))
-			return 1;
-
-		pr_info("Memory failure: %#lx cannot catch tail\n",
-			page_to_pfn(page));
-		put_page(head);
-	}
-
-	return 0;
+	return get_page_unless_zero(page);
 }
 
 /*
-- 
2.25.1
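
For readers who want to step through the interleaving described in the
commit message, here is a minimal userspace sketch. It is not kernel code
and is not part of the patch. struct toy_page and every toy_* function are
hypothetical stand-ins invented for illustration. The model assumes a
freshly allocated surplus page has refcount 1 and has neither the "freed"
nor the "migratable" state set until it is enqueued or faulted in.
hugetlb_lock is omitted because the replay is single-threaded and runs the
steps in a fixed order.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; NOT the kernel structure. */
struct toy_page {
	int refcount;     /* models the page refcount */
	bool is_huge;     /* models PageHuge() */
	bool freed;       /* models HPageFreed(): page sits on the free list */
	bool migratable;  /* models HPageMigratable(): page is in use */
};

/* Models get_page_unless_zero(): take a reference only if refcount != 0. */
static bool toy_get_unless_zero(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page_testzero(): drop a reference, report whether it hit zero. */
static bool toy_put_testzero(struct toy_page *p)
{
	return --p->refcount == 0;
}

/* Pre-patch behaviour: only the refcount is consulted. */
static int toy_get_hwpoison_old(struct toy_page *p)
{
	return toy_get_unless_zero(p);
}

/* Post-patch behaviour for the hugetlb branch (locking omitted). */
static int toy_get_hwpoison_new(struct toy_page *p)
{
	if (!p->is_huge)
		return -1; /* stands in for -EBUSY */
	if (p->freed || p->migratable)
		return toy_get_unless_zero(p);
	return 0; /* page is still being initialized: do not pin it */
}

/*
 * Replays the CPU0/CPU1 interleaving from the commit message.
 * Returns true if the allocator's "refcount must drop to zero" check
 * (VM_BUG_ON_PAGE in the kernel) would have fired.
 */
static bool toy_race_trips_bug(bool patched)
{
	/* CPU1: surplus page just allocated, not yet enqueued. */
	struct toy_page page = { .refcount = 1, .is_huge = true };

	/* CPU0: the memory error handler tries to pin the page. */
	int got = patched ? toy_get_hwpoison_new(&page)
			  : toy_get_hwpoison_old(&page);

	/* CPU1: gather_surplus_pages() expects to drop the last reference. */
	bool zero = toy_put_testzero(&page);

	/* CPU0: drops its reference only if it actually took one. */
	if (got > 0)
		page.refcount--;

	return !zero;
}
```

Calling toy_race_trips_bug(false) reproduces the failing condition in this
model, because the extra reference keeps the refcount from reaching zero.
toy_race_trips_bug(true) does not, because the page state check refuses to
pin a page that is still being set up.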