On Wed, Jul 15, 2020 at 09:59:24AM -0700, Mike Kravetz wrote:
> On 7/15/20 1:18 AM, Will Deacon wrote:
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index f24acb3af741..a0007d1d12d2 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -3273,6 +3273,9 @@ void __init hugetlb_add_hstate(unsigned int order)
> >>  	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
> >>  					huge_page_size(h)/1024);
> >
> > (nit: you can also make hugetlb_cma_reserve() static and remove its function
> > prototypes from hugetlb.h)
>
> Yes thanks. I threw this together pretty quickly.
>
> >
> >> +	if (order >= MAX_ORDER && hugetlb_cma_size)
> >> +		hugetlb_cma_reserve(order);
> >
> > Although I really like the idea of moving this out of the arch code, I don't
> > quite follow the check against MAX_ORDER here -- it looks like a bit of a
> > hack to try to intercept the "PUD_SHIFT - PAGE_SHIFT" order which we
> > currently pass to hugetlb_cma_reserve(). Maybe we could instead have
> > something like:
> >
> > #ifndef HUGETLB_CMA_ORDER
> > #define HUGETLB_CMA_ORDER	(PUD_SHIFT - PAGE_SHIFT)
> > #endif
> >
> > and then just do:
> >
> > 	if (order == HUGETLB_CMA_ORDER)
> > 		hugetlb_cma_reserve(order);
> >
> > ? Is there something else I'm missing?
>
> Well, the current hugetlb CMA code only kicks in for gigantic pages as
> defined by the hugetlb code. For example, the code to allocate a page
> from CMA is in the routine alloc_gigantic_page(). alloc_gigantic_page()
> is called from alloc_fresh_huge_page() which starts with:
>
> 	if (hstate_is_gigantic(h))
> 		page = alloc_gigantic_page(h, gfp_mask, nid, nmask);
> 	else
> 		page = alloc_buddy_huge_page(h, gfp_mask,
> 				nid, nmask, node_alloc_noretry);
>
> and, hstate_is_gigantic is,
>
> static inline bool hstate_is_gigantic(struct hstate *h)
> {
> 	return huge_page_order(h) >= MAX_ORDER;
> }
>
> So, everything in the existing code really depends on the hugetlb definition
> of gigantic page (order >= MAX_ORDER). The code to check for
> 'order >= MAX_ORDER' in my proposed patch is just following the same
> convention.

Fair enough, and thanks for the explanation. Maybe just chuck a comment in,
then? Alternatively, having something like:

static inline bool page_order_is_gigantic(unsigned int order)
{
	return order >= MAX_ORDER;
}

static inline bool hstate_is_gigantic(struct hstate *h)
{
	return page_order_is_gigantic(huge_page_order(h));
}

and then using page_order_is_gigantic() to predicate the call to
hugetlb_cma_reserve()? Dunno, maybe it's overkill. Up to you.

> I think the current dependency on the hugetlb definition of gigantic page
> may be too simplistic if using CMA for hugetlb pages becomes more common.
> Some architectures (sparc, powerpc) have more than one gigantic page size.
> Currently there is no way to specify that CMA should be used for one and
> not the other. In addition, I could imagine someone wanting to reserve/use
> CMA for non-gigantic (PMD) sized pages. There is no mechanism for that today.
>
> I honestly have not heard about many use cases for this CMA functionality.
> When support was initially added, it was driven by a specific use case and
> the 'all gigantic pages use CMA if defined' implementation was deemed
> sufficient. If there are more use cases, or this seems too simple we can
> revisit that decision.

Agreed, I think your patch is an improvement regardless of that.

Will
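
For reference, a minimal sketch of how the page_order_is_gigantic() helper
suggested above could gate the CMA reservation in hugetlb_add_hstate(). The
call site and the hugetlb_cma_size check follow the proposed patch quoted in
the thread; the elided function body and the exact placement of the helper
are assumptions made purely for illustration:

	/*
	 * Sketch only: combines the suggested helper with the call site from
	 * the proposed patch. hugetlb_cma_reserve() is assumed to have been
	 * made static in mm/hugetlb.c, per the nit above.
	 */
	static inline bool page_order_is_gigantic(unsigned int order)
	{
		return order >= MAX_ORDER;
	}

	void __init hugetlb_add_hstate(unsigned int order)
	{
		...
		snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
						huge_page_size(h)/1024);

		/* Only gigantic hstates participate in the CMA reservation. */
		if (page_order_is_gigantic(order) && hugetlb_cma_size)
			hugetlb_cma_reserve(order);
		...
	}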