On 15/08/2024 19:25, Yu Zhao wrote:
> On Thu, Aug 15, 2024 at 12:04 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>>
>> On Thu, Aug 15, 2024 at 8:41 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 2024/8/12 5:21, Yu Zhao wrote:
>>>> With alloc_contig_range() and free_contig_range() supporting large
>>>> folios, CMA can allocate and free large folios too, by
>>>> cma_alloc_folio() and cma_release().
>>>>
>>>> Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
>>>> ---
>>>>  include/linux/cma.h |  1 +
>>>>  mm/cma.c            | 47 ++++++++++++++++++++++++++++++---------------
>>>>  2 files changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>>> index 9db877506ea8..086553fbda73 100644
>>>> --- a/include/linux/cma.h
>>>> +++ b/include/linux/cma.h
>>>> @@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>>>>  					struct cma **res_cma);
>>>>  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
>>>>  			      bool no_warn);
>>>> +extern struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>>>  extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
>>>>  extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
>>>>
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index 95d6950e177b..46feb06db8e7 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -403,18 +403,8 @@ static void cma_debug_show_areas(struct cma *cma)
>>>>  	spin_unlock_irq(&cma->lock);
>>>>  }
>>>>
>>>> -/**
>>>> - * cma_alloc() - allocate pages from contiguous area
>>>> - * @cma: Contiguous memory region for which the allocation is performed.
>>>> - * @count: Requested number of pages.
>>>> - * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> - * @no_warn: Avoid printing message about failed allocation
>>>> - *
>>>> - * This function allocates part of contiguous memory on specific
>>>> - * contiguous memory area.
>>>> - */
>>>> -struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> -		       unsigned int align, bool no_warn)
>>>> +static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>> +				unsigned int align, gfp_t gfp)
>>>>  {
>>>>  	unsigned long mask, offset;
>>>>  	unsigned long pfn = -1;
>>>> @@ -463,8 +453,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>
>>>>  		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>>>>  		mutex_lock(&cma_mutex);
>>>> -		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
>>>> -					 GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> +		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>>>>  		mutex_unlock(&cma_mutex);
>>>>  		if (ret == 0) {
>>>>  			page = pfn_to_page(pfn);
>>>> @@ -494,7 +483,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>  			page_kasan_tag_reset(nth_page(page, i));
>>>>  	}
>>>>
>>>> -	if (ret && !no_warn) {
>>>> +	if (ret && !(gfp & __GFP_NOWARN)) {
>>>>  		pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
>>>>  				   __func__, cma->name, count, ret);
>>>>  		cma_debug_show_areas(cma);
>>>> @@ -513,6 +502,34 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>  	return page;
>>>>  }
>>>>
>>>> +/**
>>>> + * cma_alloc() - allocate pages from contiguous area
>>>> + * @cma: Contiguous memory region for which the allocation is performed.
>>>> + * @count: Requested number of pages.
>>>> + * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> + * @no_warn: Avoid printing message about failed allocation
>>>> + *
>>>> + * This function allocates part of contiguous memory on specific
>>>> + * contiguous memory area.
>>>> + */
>>>> +struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> +		       unsigned int align, bool no_warn)
>>>> +{
>>>> +	return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> +}
>>>> +
>>>> +struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
>>>> +{
>>>> +	struct page *page;
>>>> +
>>>> +	if (WARN_ON(order && !(gfp & __GFP_COMP)))
>>>> +		return NULL;
>>>> +
>>>> +	page = __cma_alloc(cma, 1 << order, order, gfp);
>>>> +
>>>> +	return page ? page_folio(page) : NULL;
>>>
>>> We don't set large_rmappable for the CMA-allocated folio, which is
>>> inconsistent with other folio allocations, e.g.
>>> folio_alloc()/folio_alloc_mpol(). There is no issue for HugeTLB folios,
>>> and a HugeTLB folio must be without large_rmappable anyway, but once we
>>> use this for mTHP/THP it will need some extra handling. Maybe we should
>>> set large_rmappable here and clear it in init_new_hugetlb_folio()?
>>
>> I want to hear what Matthew thinks about this.
>>
>> My opinion is that we don't want to couple large rmappable (or
>> deferred splittable) with __GFP_COMP, or, for that matter, with large
>> folios, because the former are specific to THPs whereas the latter can
>> potentially work for most types of high-order allocations.
>>
>> Again, IMO, if we want to seriously answer the question of
>>   Can we get rid of non-compound multi-page allocations? [1]
>> then we should start planning to decouple large rmappable from the
>> generic folio allocation API.
>>
>> [1] https://lpc.events/event/18/sessions/184/#20240920
>
> Also, along similar lines, Usama is trying to add PG_partially_mapped
> [1], and I have explicitly asked him not to introduce that flag to
> hugeTLB unless there are good reasons (none ATM).
>
> [1] https://lore.kernel.org/CAOUHufbmgwZwzUuHVvEDMqPGcsxE2hEreRZ4PhK5yz27GdK-Tw@xxxxxxxxxxxxxx/

PG_partially_mapped won't be cleared for hugeTLB in the next revision of
the series, as suggested by Yu. It's not there in the fix patch I posted
as well, in
https://lore.kernel.org/all/4acdf2b7-ed65-4087-9806-8f4a187b4eb5@xxxxxxxxx/
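As an aside, for anyone new to the interface quoted above, here is a minimal
caller sketch based on the patch as posted. The CMA area pointer (my_cma) and
the wrapper names are hypothetical, not part of the patch: it allocates a
folio of the given order (passing __GFP_COMP, which cma_alloc_folio() expects
for order > 0) and releases it with cma_release(), as the commit message
describes.

#include <linux/cma.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical wrapper: allocate an order-N folio from a CMA area. */
static struct folio *my_cma_alloc_folio(struct cma *my_cma, int order)
{
	return cma_alloc_folio(my_cma, order,
			       GFP_KERNEL | __GFP_COMP | __GFP_NOWARN);
}

/* Hypothetical wrapper: hand the folio back to the same CMA area. */
static void my_cma_free_folio(struct cma *my_cma, struct folio *folio)
{
	cma_release(my_cma, &folio->page, folio_nr_pages(folio));
}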