On 4/28/21 5:26 AM, Muchun Song wrote:
> On Wed, Apr 28, 2021 at 7:47 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>
>> Thanks!  I will take a look at the modifications soon.
>>
>> I applied the patches to Andrew's mmotm-2021-04-21-23-03, ran some tests and
>> got the following warning.  We may need to special case that call to
>> __prep_new_huge_page/free_huge_page_vmemmap from alloc_and_dissolve_huge_page,
>> as that path holds the hugetlb lock with IRQs disabled.
>
> Good catch. Thanks Mike. I will fix it in the next version. How about this:
>
> @@ -1618,7 +1617,8 @@ static void __prep_new_huge_page(struct hstate *h, struct page *page)
>
>  static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>  {
> +       free_huge_page_vmemmap(h, page);
>         __prep_new_huge_page(page);
>         spin_lock_irq(&hugetlb_lock);
>         __prep_account_new_huge_page(h, nid);
>         spin_unlock_irq(&hugetlb_lock);
> @@ -2429,6 +2429,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>         if (!new_page)
>                 return -ENOMEM;
>
> +       free_huge_page_vmemmap(h, new_page);
>  retry:
>         spin_lock_irq(&hugetlb_lock);
>         if (!PageHuge(old_page)) {
> @@ -2489,7 +2490,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>
>  free_new:
>         spin_unlock_irq(&hugetlb_lock);
> -       __free_pages(new_page, huge_page_order(h));
> +       update_and_free_page(h, new_page, false);
>
>         return ret;
>  }
>

Another option would be to leave the prep* routines as is and only modify
alloc_and_dissolve_huge_page as follows:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9c617c19fc18..f8e5013a6b46 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2420,14 +2420,15 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 
        /*
         * Before dissolving the page, we need to allocate a new one for the
-        * pool to remain stable.  Using alloc_buddy_huge_page() allows us to
-        * not having to deal with prep_new_huge_page() and avoids dealing of any
-        * counters. This simplifies and let us do the whole thing under the
-        * lock.
+        * pool to remain stable.  Here, we allocate the page and 'prep' it
+        * by doing everything but actually updating counters and adding it
+        * to the pool.  This simplifies and lets us do most of the processing
+        * under the lock.
         */
        new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
        if (!new_page)
                return -ENOMEM;
+       __prep_new_huge_page(h, new_page);
 
 retry:
        spin_lock_irq(&hugetlb_lock);
@@ -2473,7 +2474,6 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
         * Reference count trick is needed because allocator gives us
         * referenced page but the pool requires pages with 0 refcount.
         */
-       __prep_new_huge_page(h, new_page);
        __prep_account_new_huge_page(h, nid);
        page_ref_dec(new_page);
        enqueue_huge_page(h, new_page);
@@ -2489,7 +2489,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 
 free_new:
        spin_unlock_irq(&hugetlb_lock);
-       __free_pages(new_page, huge_page_order(h));
+       update_and_free_page(h, new_page, false);
 
        return ret;
 }
-- 
Mike Kravetz
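
[Editor's note: for readers following the ordering question outside the kernel tree, below is a minimal, stand-alone C sketch of the control flow the second option aims for: do the potentially sleeping allocation/prep work before taking the spinlock, do only the quick accounting under the lock, and free the prepped page on the error path after dropping the lock. None of the names here (fake_page, alloc_and_prep_page, replace_page, the pthread spinlock standing in for hugetlb_lock) are kernel APIs; they are hypothetical placeholders for illustration only.]

/*
 * Hypothetical user-space model of the proposed ordering in
 * alloc_and_dissolve_huge_page(); not kernel code.
 *
 * Build (Linux): cc -O2 -o sketch sketch.c -lpthread
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_page { int nid; bool prepped; };

static pthread_spinlock_t fake_hugetlb_lock;   /* stand-in for hugetlb_lock */
static long fake_nr_huge_pages;                /* stand-in for pool counters */

/* May block/allocate, so it must run before the spinlock is taken. */
static struct fake_page *alloc_and_prep_page(int nid)
{
	struct fake_page *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->nid = nid;
	p->prepped = true;   /* models the __prep_new_huge_page()/vmemmap work */
	return p;
}

/* Heavily simplified model of the replace-and-dissolve flow. */
static int replace_page(bool old_page_still_huge, int nid)
{
	struct fake_page *new_page = alloc_and_prep_page(nid);

	if (!new_page)
		return -1;

	pthread_spin_lock(&fake_hugetlb_lock);
	if (!old_page_still_huge) {
		/* Error path: drop the lock first, then free the prepped page. */
		pthread_spin_unlock(&fake_hugetlb_lock);
		free(new_page);   /* models freeing new_page, not old_page */
		return -1;
	}
	fake_nr_huge_pages++;     /* models __prep_account_new_huge_page() */
	pthread_spin_unlock(&fake_hugetlb_lock);
	return 0;
}

int main(void)
{
	pthread_spin_init(&fake_hugetlb_lock, PTHREAD_PROCESS_PRIVATE);
	printf("success path: %d\n", replace_page(true, 0));
	printf("error   path: %d\n", replace_page(false, 0));
	printf("pool size: %ld\n", fake_nr_huge_pages);
	pthread_spin_destroy(&fake_hugetlb_lock);
	return 0;
}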