On Fri, Aug 16, 2024 at 9:23 AM Zi Yan <ziy@xxxxxxxxxx> wrote:
>
> On 13 Aug 2024, at 23:54, Yu Zhao wrote:
>
> > Use __GFP_COMP for gigantic folios to greatly reduce not only the
> > amount of code but also the allocation and free time.
> >
> > LOC (approximately): +60, -240
> >
> > Allocate and free 500 1GB hugeTLB folios without HVO by:
> >   time echo 500 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> >   time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> >
> >         Before  After
> > Alloc   ~13s    ~10s
> > Free    ~15s    <1s
> >
> > The above magnitude generally holds for multiple x86 and arm64 CPU
> > models.
> >
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > Reported-by: Frank van der Linden <fvdl@xxxxxxxxxx>
> > ---
> >  include/linux/hugetlb.h |   9 +-
> >  mm/hugetlb.c            | 293 ++++++++--------------------------
> >  2 files changed, 62 insertions(+), 240 deletions(-)
> >
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index 3100a52ceb73..98c47c394b89 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -896,10 +896,11 @@ static inline bool hugepage_movable_supported(struct hstate *h)
> >  /* Movability of hugepages depends on migration support. */
> >  static inline gfp_t htlb_alloc_mask(struct hstate *h)
> >  {
> > -	if (hugepage_movable_supported(h))
> > -		return GFP_HIGHUSER_MOVABLE;
> > -	else
> > -		return GFP_HIGHUSER;
> > +	gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
> > +
> > +	gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
> > +
> > +	return gfp;
> >  }
> >
> >  static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 71d469c8e711..efa77ce87dcc 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -56,16 +56,6 @@ struct hstate hstates[HUGE_MAX_HSTATE];
> >  #ifdef CONFIG_CMA
> >  static struct cma *hugetlb_cma[MAX_NUMNODES];
> >  static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
> > -static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
> > -{
> > -	return cma_pages_valid(hugetlb_cma[folio_nid(folio)], &folio->page,
> > -				1 << order);
> > -}
> > -#else
> > -static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
> > -{
> > -	return false;
> > -}
> >  #endif
> >  static unsigned long hugetlb_cma_size __initdata;
> >
> > @@ -100,6 +90,17 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
> >  		unsigned long start, unsigned long end);
> >  static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
> >
> > +static void hugetlb_free_folio(struct folio *folio)
> > +{
> > +#ifdef CONFIG_CMA
> > +	int nid = folio_nid(folio);
> > +
> > +	if (cma_free_folio(hugetlb_cma[nid], folio))
> > +		return;
> > +#endif
> > +	folio_put(folio);
> > +}
> > +
>
> It seems that we no longer use free_contig_range() to free gigantic
> folios allocated from alloc_contig_range().

We switched to two pairs of external (to the allocator) APIs in this patch:
  folio_alloc_gigantic() / folio_put()
and
  cma_alloc_folio() / cma_free_folio()

> Will it work? Or did I miss anything?

alloc_contig_range() and free_contig_range() also work with
__GFP_COMP / large folios, but this pair is internal to the allocator
and shouldn't be used directly, except to implement external APIs
like the ones above.
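
FWIW, a minimal sketch of what a caller outside the allocator looks
like with the new pairs. grab_gigantic_folio() and
drop_gigantic_folio() are made-up helper names just for illustration,
and the folio_alloc_gigantic(order, gfp, nid, nodemask) signature is
the one assumed from this series:

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * Allocate a gigantic folio from node @nid. __GFP_COMP is what makes
 * the contiguous range come back as a single compound page (folio)
 * rather than a bare range of order-0 pages.
 */
static struct folio *grab_gigantic_folio(int nid, int order)
{
	gfp_t gfp = GFP_HIGHUSER_MOVABLE | __GFP_COMP | __GFP_NOWARN;

	return folio_alloc_gigantic(order, gfp, nid, NULL);
}

/*
 * Free it: with __GFP_COMP there is no special free path anymore --
 * dropping the last reference via folio_put() tears the folio down,
 * which is why the hugeTLB free side shrank so much in this patch.
 */
static void drop_gigantic_folio(struct folio *folio)
{
	folio_put(folio);
}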