On 15/08/2024 19:25, Yu Zhao wrote:
> On Thu, Aug 15, 2024 at 12:04 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>>
>> On Thu, Aug 15, 2024 at 8:41 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 2024/8/12 5:21, Yu Zhao wrote:
>>>> With alloc_contig_range() and free_contig_range() supporting large
>>>> folios, CMA can allocate and free large folios too, by
>>>> cma_alloc_folio() and cma_release().
>>>>
>>>> Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
>>>> ---
>>>>  include/linux/cma.h |  1 +
>>>>  mm/cma.c            | 47 ++++++++++++++++++++++++++++++---------------
>>>>  2 files changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>>> index 9db877506ea8..086553fbda73 100644
>>>> --- a/include/linux/cma.h
>>>> +++ b/include/linux/cma.h
>>>> @@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>>>>  					struct cma **res_cma);
>>>>  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
>>>>  			      bool no_warn);
>>>> +extern struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>>>  extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
>>>>  extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
>>>>
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index 95d6950e177b..46feb06db8e7 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -403,18 +403,8 @@ static void cma_debug_show_areas(struct cma *cma)
>>>>  	spin_unlock_irq(&cma->lock);
>>>>  }
>>>>
>>>> -/**
>>>> - * cma_alloc() - allocate pages from contiguous area
>>>> - * @cma: Contiguous memory region for which the allocation is performed.
>>>> - * @count: Requested number of pages.
>>>> - * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> - * @no_warn: Avoid printing message about failed allocation
>>>> - *
>>>> - * This function allocates part of contiguous memory on specific
>>>> - * contiguous memory area.
>>>> - */
>>>> -struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> -		       unsigned int align, bool no_warn)
>>>> +static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>> +				unsigned int align, gfp_t gfp)
>>>>  {
>>>>  	unsigned long mask, offset;
>>>>  	unsigned long pfn = -1;
>>>> @@ -463,8 +453,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>
>>>>  		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>>>>  		mutex_lock(&cma_mutex);
>>>> -		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
>>>> -					 GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> +		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>>>>  		mutex_unlock(&cma_mutex);
>>>>  		if (ret == 0) {
>>>>  			page = pfn_to_page(pfn);
>>>> @@ -494,7 +483,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>  			page_kasan_tag_reset(nth_page(page, i));
>>>>  	}
>>>>
>>>> -	if (ret && !no_warn) {
>>>> +	if (ret && !(gfp & __GFP_NOWARN)) {
>>>>  		pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
>>>>  				   __func__, cma->name, count, ret);
>>>>  		cma_debug_show_areas(cma);
>>>> @@ -513,6 +502,34 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>  	return page;
>>>>  }
>>>>
>>>> +/**
>>>> + * cma_alloc() - allocate pages from contiguous area
>>>> + * @cma: Contiguous memory region for which the allocation is performed.
>>>> + * @count: Requested number of pages.
>>>> + * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> + * @no_warn: Avoid printing message about failed allocation
>>>> + *
>>>> + * This function allocates part of contiguous memory on specific
>>>> + * contiguous memory area.
>>>> + */
>>>> +struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> +		       unsigned int align, bool no_warn)
>>>> +{
>>>> +	return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> +}
>>>> +
>>>> +struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
>>>> +{
>>>> +	struct page *page;
>>>> +
>>>> +	if (WARN_ON(order && !(gfp & __GFP_COMP)))
>>>> +		return NULL;
>>>> +
>>>> +	page = __cma_alloc(cma, 1 << order, order, gfp);
>>>> +
>>>> +	return page ? page_folio(page) : NULL;
>>>
>>> We don't set large_rmappable for the CMA-allocated folio, which is
>>> inconsistent with other folio allocations, e.g.
>>> folio_alloc()/folio_alloc_mpol(). There is no issue for HugeTLB folios,
>>> and a HugeTLB folio must be without large_rmappable anyway, but once we
>>> use this for mTHP/THP it will need some extra handling. Maybe we should
>>> set large_rmappable here and clear it in init_new_hugetlb_folio()?
>>
>> I want to hear what Matthew thinks about this.
>>
>> My opinion is that we don't want to couple large rmappable (or
>> deferred splittable) with __GFP_COMP, or, for that matter, with large
>> folios, because the former are specific to THPs whereas the latter can
>> potentially work for most types of high-order allocations.
>>
>> Again, IMO, if we want to seriously answer the question of
>>   Can we get rid of non-compound multi-page allocations? [1]
>> then we should start planning to decouple large rmappable from the
>> generic folio allocation API.
>>
>> [1] https://lpc.events/event/18/sessions/184/#20240920
>
> Also, along similar lines, Usama is trying to add PG_partially_mapped
> [1], and I have explicitly asked him not to introduce that flag to
> hugeTLB unless there are good reasons (none ATM).
>
> [1] https://lore.kernel.org/CAOUHufbmgwZwzUuHVvEDMqPGcsxE2hEreRZ4PhK5yz27GdK-Tw@xxxxxxxxxxxxxx/

PG_partially_mapped won't be cleared for hugeTLB in the next revision of
the series, as suggested by Yu. It's not there in the fix patch I posted
as well, in
https://lore.kernel.org/all/4acdf2b7-ed65-4087-9806-8f4a187b4eb5@xxxxxxxxx/
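As an aside, for anyone new to the interface quoted above, here is a minimal
caller sketch based on the patch as posted. The CMA area pointer (my_cma) and
the wrapper names are hypothetical, not part of the patch: it allocates a
folio of the given order (passing __GFP_COMP, which cma_alloc_folio() expects
for order > 0) and releases it with cma_release(), as the commit message
describes.

#include <linux/cma.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical wrapper: allocate an order-N folio from a CMA area. */
static struct folio *my_cma_alloc_folio(struct cma *my_cma, int order)
{
	return cma_alloc_folio(my_cma, order,
			       GFP_KERNEL | __GFP_COMP | __GFP_NOWARN);
}

/* Hypothetical wrapper: hand the folio back to the same CMA area. */
static void my_cma_free_folio(struct cma *my_cma, struct folio *folio)
{
	cma_release(my_cma, &folio->page, folio_nr_pages(folio));
}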