On 13 Aug 2024, at 23:54, Yu Zhao wrote:

> Use __GFP_COMP for gigantic folios to greatly reduce not only the
> amount of code but also the allocation and free time.
>
> LOC (approximately): +60, -240
>
> Allocate and free 500 1GB hugeTLB memory without HVO by:
>   time echo 500 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>   time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
>          Before  After
>   Alloc  ~13s    ~10s
>   Free   ~15s    <1s
>
> The above magnitude generally holds for multiple x86 and arm64 CPU
> models.
>
> Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> Reported-by: Frank van der Linden <fvdl@xxxxxxxxxx>
> ---
>  include/linux/hugetlb.h |   9 +-
>  mm/hugetlb.c            | 293 ++++++++--------------------------------
>  2 files changed, 62 insertions(+), 240 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 3100a52ceb73..98c47c394b89 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -896,10 +896,11 @@ static inline bool hugepage_movable_supported(struct hstate *h)
>  /* Movability of hugepages depends on migration support. */
>  static inline gfp_t htlb_alloc_mask(struct hstate *h)
>  {
> -	if (hugepage_movable_supported(h))
> -		return GFP_HIGHUSER_MOVABLE;
> -	else
> -		return GFP_HIGHUSER;
> +	gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
> +
> +	gfp |= hugepage_movable_supported(h) ?
> +			GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
> +
> +	return gfp;
>  }
>
>  static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 71d469c8e711..efa77ce87dcc 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -56,16 +56,6 @@ struct hstate hstates[HUGE_MAX_HSTATE];
>  #ifdef CONFIG_CMA
>  static struct cma *hugetlb_cma[MAX_NUMNODES];
>  static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
> -static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
> -{
> -	return cma_pages_valid(hugetlb_cma[folio_nid(folio)], &folio->page,
> -			       1 << order);
> -}
> -#else
> -static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
> -{
> -	return false;
> -}
>  #endif
>  static unsigned long hugetlb_cma_size __initdata;
>
> @@ -100,6 +90,17 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
>  		unsigned long start, unsigned long end);
>  static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
>
> +static void hugetlb_free_folio(struct folio *folio)
> +{
> +#ifdef CONFIG_CMA
> +	int nid = folio_nid(folio);
> +
> +	if (cma_free_folio(hugetlb_cma[nid], folio))
> +		return;
> +#endif
> +	folio_put(folio);
> +}
> +

It seems that we no longer use free_contig_range() to free gigantic
folios from alloc_contig_range(). Will it work? Or did I miss anything?

Best Regards,
Yan, Zi