The possible bad scenario:

CPU0:                           CPU1:

                                gather_surplus_pages()
                                  page = alloc_surplus_huge_page()
memory_failure_hugetlb()
  get_hwpoison_page(page)
    __get_hwpoison_page(page)
      get_page_unless_zero(page)
                                  zero = put_page_testzero(page)
                                  VM_BUG_ON_PAGE(!zero, page)
                                  enqueue_huge_page(h, page)
  put_page(page)

The refcount can be increased by the memory-failure or soft_offline
handlers in the window above, in which case we trigger VM_BUG_ON_PAGE
and wrongly add the page to the hugetlb pool list.

Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
---
 mm/hugetlb.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3476aa06da70..6c96332db34b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2145,17 +2145,14 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 
 	/* Free the needed pages to the hugetlb pool */
 	list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
-		int zeroed;
-
 		if ((--needed) < 0)
 			break;
 		/*
-		 * This page is now managed by the hugetlb allocator and has
-		 * no users -- drop the buddy allocator's reference.
+		 * The refcount can possibly be increased by memory-failure or
+		 * soft_offline handlers.
 		 */
-		zeroed = put_page_testzero(page);
-		VM_BUG_ON_PAGE(!zeroed, page);
-		enqueue_huge_page(h, page);
+		if (likely(put_page_testzero(page)))
+			enqueue_huge_page(h, page);
 	}
 free:
 	spin_unlock_irq(&hugetlb_lock);
-- 
2.11.0
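
For illustration only, not part of the patch: below is a minimal userspace
C model of the refcount race, under the assumption that get_page_unless_zero()
behaves like a compare-and-swap loop that only takes a reference while the
count is non-zero, and put_page_testzero() like an atomic decrement that
reports whether it dropped the last reference. The model_* helpers and the
single-page refcount are hypothetical stand-ins, not kernel code.

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	static atomic_int refcount = 1;	/* the buddy allocator's reference */

	/* Model of get_page_unless_zero(): take a ref only if count != 0. */
	static bool model_get_page_unless_zero(void)
	{
		int old = atomic_load(&refcount);

		while (old != 0) {
			if (atomic_compare_exchange_weak(&refcount, &old, old + 1))
				return true;	/* reference taken */
		}
		return false;			/* page already at zero */
	}

	/* Model of put_page_testzero(): drop a ref, report if it hit zero. */
	static bool model_put_page_testzero(void)
	{
		return atomic_fetch_sub(&refcount, 1) == 1;
	}

	int main(void)
	{
		/* "CPU0" (memory-failure handler) wins the race: count 1 -> 2. */
		bool pinned = model_get_page_unless_zero();

		/* "CPU1" (gather_surplus_pages) now sees a non-zero result. */
		if (model_put_page_testzero())
			puts("enqueue_huge_page()");	/* we held the last ref */
		else
			puts("skip enqueue");	/* old code: VM_BUG_ON_PAGE fired here */

		(void)pinned;
		return 0;
	}

In this interleaving the decrement returns false, which is exactly the case
the old VM_BUG_ON_PAGE asserted could never happen. The fix simply skips the
enqueue when the race is lost; the remaining reference holder drops the last
reference later via put_page(), which frees the page through the normal path.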