Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

Barry Song <baohua@xxxxxxxxxx> · Thu, 13 Jun 2024 21:43:48 +1200

On Thu, Jun 13, 2024 at 8:49 PM Zhai He <zhai.he@xxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Barry Song <baohua@xxxxxxxxxx>
> > Sent: Thursday, June 13, 2024 3:38 PM
> > To: Zhai He <zhai.he@xxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; sboyd@xxxxxxxxxx;
> > linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Zhipeng Wang
> > <zhipeng.wang_1@xxxxxxx>; Jindong Yue <jindong.yue@xxxxxxx>; Christoph
> > Hellwig <hch@xxxxxx>
> > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> > device-specified CMA memory is not enough.
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments. When in doubt, report the message using the 'Report this
> > email' button
> >
> >
> > On Thu, Jun 13, 2024 at 7:11 PM Zhai He <zhai.he@xxxxxxx> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Barry Song <baohua@xxxxxxxxxx>
> > > > Sent: Thursday, June 13, 2024 2:15 PM
> > > > To: Zhai He <zhai.he@xxxxxxx>
> > > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; sboyd@xxxxxxxxxx;
> > > > linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Zhipeng Wang
> > > > <zhipeng.wang_1@xxxxxxx>; Jindong Yue <jindong.yue@xxxxxxx>;
> > > > Christoph Hellwig <hch@xxxxxx>
> > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > > when the device-specified CMA memory is not enough.
> > > >
> > > >
> > > > On Thu, Jun 13, 2024 at 5:32 PM Zhai He <zhai.he@xxxxxxx> wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Barry Song <baohua@xxxxxxxxxx>
> > > > > > Sent: Thursday, June 13, 2024 11:28 AM
> > > > > > To: Zhai He <zhai.he@xxxxxxx>
> > > > > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; sboyd@xxxxxxxxxx;
> > > > > > linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Zhipeng Wang
> > > > > > <zhipeng.wang_1@xxxxxxx>; Jindong Yue <jindong.yue@xxxxxxx>;
> > > > > > Christoph Hellwig <hch@xxxxxx>
> > > > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > > CMA when the device-specified CMA memory is not enough.
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <zhai.he@xxxxxxx> wrote:
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Barry Song <baohua@xxxxxxxxxx>
> > > > > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > > > > To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > > > > > Cc: Zhai He <zhai.he@xxxxxxx>; sboyd@xxxxxxxxxx;
> > > > > > > > linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > > > > > stable@xxxxxxxxxxxxxxx; Zhipeng Wang
> > > > > > > > <zhipeng.wang_1@xxxxxxx>; Jindong Yue <jindong.yue@xxxxxxx>;
> > > > > > > > Christoph Hellwig <hch@xxxxxx>
> > > > > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > > > > CMA when the device-specified CMA memory is not enough.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > > > > <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <zhai.he@xxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > > From: He Zhai <zhai.he@xxxxxxx>
> > > > > > > > >
> > > > > > > > > (cc Barry & Christoph)
> > > > > > > > >
> > > > > > > > > What was your reason for adding cc:stable to the email headers?
> > > > > > > > > Does this address some serious problem?  If so, please
> > > > > > > > > fully describe that problem.
> > > > > > > > >
> > > > > > > > > > In the current code logic, if the device-specified CMA
> > > > > > > > > > memory allocation fails, memory will not be allocated
> > > > > > > > > > from the default CMA area.
> > > > > > > > > > This patch will use the default CMA region when the
> > > > > > > > > > device's specified CMA is not enough.
> > > > > > > > > >
> > > > > > > > > > In addition, the log level of allocation failure is changed
> > > > > > > > > > to debug, because these logs are printed when memory
> > > > > > > > > > allocation from the device-specified CMA fails, even though
> > > > > > > > > > the allocation is then retried from the default CMA area.
> > > > > > > > > > They can easily mislead developers' judgment.
> > > > > > > >
> > > > > > > > I am not convinced that this patch is correct. If
> > > > > > > > device-specific CMA is too small, why not increase it in the
> > > > > > > > device tree?
> > > > > > > > Conversely, if the default CMA size is too large, why not
> > > > > > > > reduce it via the cmdline?  CMA offers all kinds of flexible
> > > > > > > > configuration options based on users’ needs.
> > > > > > > >
> > > > > > > > One significant benefit of device-specific CMA is that it
> > > > > > > > helps decrease fragmentation in the common CMA pool. While
> > > > > > > > many devices allocate memory from the same pool, they have
> > > > > > > > different memory requirements in terms of sizes and
> > > > > > > > alignments. Repeated memory allocation and release can
> > > > > > > > lead to situations where the CMA pool has enough free space,
> > > > > > > > yet someone fails to obtain contiguous memory from it.
> > > > > > > >
> > > > > > > > This patch entirely negates the advantage we gain from
> > > > > > > > device-specific CMA.
> > > > > > > > My point is that instead of modifying the core code, please
> > > > > > > > consider correcting your device tree or cmdline configurations.
> > > > > > > >
> > > > > > > Because we enabled secure heap to support Widevine DRM, and
> > > > > > > secure heap requires security configuration, its starting
> > > > > > > address cannot be specified arbitrarily, which causes the
> > > > > > > default CMA to be reduced. So we reduced the CMA, but in order
> > > > > > > to avoid the impact of reducing the CMA, we used a
> > > > > > > multi-segment CMA and gave one segment to the VPU.
> > > > > > >
> > > > > > > However, under our memory configuration, the device-specific
> > > > > > > CMA is not enough to support the VPU decoding high-resolution
> > > > > > > code streams, so this patch is added so that the VPU can work
> > > > > > > properly.
> > > > > > > Thanks.
> > > > > >
> > > > > > I don’t quite understand what you are saying. Why can’t you
> > > > > > increase VPU’s CMA size?
> > > > > Thanks for your quick reply.
> > > > > Because we added a secure heap to support Widevine DRM, and this
> > > > > heap requires hardware protection, its starting address cannot be
> > > > > specified arbitrarily. This causes the secure heap to occupy part of
> > > > > the default CMA, and the default CMA is therefore reduced. To keep
> > > > > the shrunken default CMA from introducing other problems, we
> > > > > added a specific CMA area for the VPU. However, due to the large
> > > > > size of the secure heap and default CMA, there is no remaining
> > > > > memory available to increase the specific CMA for the VPU.
> > > >
> > > > I assume the secure heap you are referring to is a section of memory
> > > > that should only be accessed by TrustZone and not be visible to
> > > > Linux running in non-secure mode. How do you allocate this secure heap
> > > > from the default CMA?
> > >
> > > No, the secure heap is reserved memory. It is not allocated from CMA;
> > > it is reserved during the kernel startup phase.
> > > This reserved memory is protected by hardware. Only specific hardware
> > > and the secure world can access it.
> > > For example:
> > > &{/reserved-memory/} {
> > >         secure_region: secure {
> > >                 compatible = "imx-secure-ion-pool";
> > >                 reg = <0x0 0xA0000000 0 0x1EF00000>;
> > >         };
> > > };
> > >
> > > > Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs?
> > > > Given that the VPU has its own device-specific CMA, why is this
> > > > secure heap allocated from the default CMA instead of the VPU's CMA?
> > > >
> > > The VPU driver will use dma_alloc_coherent() to allocate contiguous
> > > memory. The secure heap is not allocated from the CMA, but because the
> > > secure heap is enabled, it occupies some contiguous memory, causing the
> > > default CMA to be reduced.
> > >
> > > > If this secure heap was allocated before the kernel booted, why did
> > > > the kernel(your dts) fail to mark this area as nomap/reserved to
> > > > prevent the default CMA from intersecting with it?
> > > >
> > > Secure heap does not intersect with the CMA.
> > > For example:
> > > Before secure heap enabled:
> > > 0xA0000000 ~ 0xFFFFFFFF: default CMA
> > > After secure heap enabled:
> > > 0x90000000 ~ 0x9FFFFFFF is the CMA specified by VPU,
> > > 0xA0000000 ~ 0xAFFFFFFF is secure heap (the start address cannot be
> > > specified arbitrarily, because this memory is protected by hardware. If
> > > the start address were 0x90000000, U-Boot would use this memory, but
> > > U-Boot can't access it because of the hardware protection. So we chose
> > > a section of memory that U-Boot will not use as the secure heap.
> > > Note: The memory of U-Boot can be adjusted, but avoiding the secure
> > > heap will limit the memory range that U-Boot can use, causing problems
> > > such as the U-Boot stack),
> > > 0xB0000000 ~ 0xFFFFFFFF is default CMA.
> > > So default CMA is reduced.
> >
> > How is that related to your patch? I assume the default CMA is reduced because
> > you modified it in the DTS after enabling the secure heap, as the CMA size is set
> > by you. The default CMA size won't automatically decrease due to the secure
> > heap. To me, 0xB0000000-0xFFFFFFFF(1.25GiB) is still too large a CMA.
> >
>
> Sorry, this example is just an example. In fact, the size of our default
> CMA is less than 1.25GiB.
> Our current memory distribution is as follows. The size of "c" (default
> CMA) cannot meet our requirements, and "b" (reserved memory for secure) is
> fixed, so we cannot expand "c" (default CMA) by modifying the DTS. We
> therefore reserved "a" (specific CMA) for the VPU. However, we have
> confirmed with the multimedia team that the maximum size required is
> uncertain, so "a" is specified for the VPU to use first, and "c" is for
> other devices that require contiguous memory. If "a" is not enough, use "c".
> That's the purpose of this patch.
>         --------------------------------------------------
>         | a. VPU specific CMA                            |
>         --------------------------------------------------
>         | b. reserved memory for secure                  |
>         --------------------------------------------------
>         | c. default CMA                                 |
>         --------------------------------------------------
> > >

Ok, I understand your problem. Because B is enabled, you can't have C as large
as you would without B. So you add A, but A might have insufficient space for
high-resolution video. You also don't know the exact size needed for A. In the
corner case of encoding/decoding high-resolution video, A might be insufficient,
forcing you to borrow memory from C.  Because B is situated between A and C,
creating a gap, you cannot merge A and C into a single default CMA.
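
To make the gap concrete, here is a minimal userspace sketch (not kernel
code) using the illustrative addresses quoted earlier in this thread, which
you said are only an example and not your real layout. The struct and the
helper name are hypothetical; the point is just that A and C are not
adjacent, so they cannot form one contiguous area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: a half-open physical range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/*
 * Two ranges can be merged into one contiguous area only if the lower
 * one ends exactly where the higher one begins.
 */
static bool regions_mergeable(struct region lo, struct region hi)
{
	assert(lo.start < lo.end && hi.start < hi.end);
	return lo.end == hi.start;
}

/* Illustrative layout from the example in this thread (assumed values). */
static const struct region vpu_cma     = { 0x90000000ULL, 0xA0000000ULL };  /* A */
static const struct region secure_heap = { 0xA0000000ULL, 0xB0000000ULL };  /* B */
static const struct region default_cma = { 0xB0000000ULL, 0x100000000ULL }; /* C */
```

With these values, regions_mergeable(vpu_cma, default_cma) is false because
B sits between them.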

This does indeed seem like a valid requirement. Please make sure to describe
the problem clearly next time. However, as a general rule, allowing
device-specific CMAs to borrow from the default CMA is not advisable. This
would undermine the reasons why we started supporting device-specific CMAs
in the first place.

So the problem is that memory holes may prevent the formation of large CMAs.

This situation raises a question: can we have two or more default CMAs due to
memory holes like B, which might hinder the system from obtaining a sufficiently
large default CMA?

If you could define both A and C as default CMAs, you wouldn't require these
kinds of fallbacks. Is it possible for you to make
dma_contiguous_default_area a list rather than a single global variable?
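
Roughly what I have in mind, as a userspace model only: the types and
function names below are hypothetical stand-ins, not the kernel's CMA API,
and a real implementation would call into the existing allocator for each
area instead of this toy bump allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one default CMA area (toy bump allocator). */
struct fake_area {
	uint64_t base;	/* start address of the area, assumed non-zero */
	uint64_t size;	/* total bytes in the area */
	uint64_t used;	/* bytes already handed out */
};

/* Try a single area; return an address, or 0 if it lacks space. */
static uint64_t area_alloc(struct fake_area *a, uint64_t bytes)
{
	uint64_t addr;

	assert(bytes > 0);
	if (a->size - a->used < bytes)
		return 0;
	addr = a->base + a->used;
	a->used += bytes;
	return addr;
}

/* Walk the list of default areas in order; the first one that fits wins. */
static uint64_t default_areas_alloc(struct fake_area *areas, size_t n,
				    uint64_t bytes)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t addr = area_alloc(&areas[i], bytes);

		if (addr)
			return addr;
	}
	return 0;	/* no single area can satisfy the request */
}
```

The fallback then lives in one place for every user of the default CMA,
and device-specific CMAs keep their isolation.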

Another option is to allow devices to have more than one memory-region, so
that the device tree itself expresses the fallback order:
memory-region = <&mem1, &mem2>;

My perspective is that I acknowledge your problem as a valid requirement.
However, I find the approach to be too aggressive.

> > > > >
> > > > > > It seems you mean that only in some corner cases do you need a
> > > > > > large CMA, but most of the time, you don’t need it to be this
> > > > > > big? So you have to "borrow" memory from the default CMA. But
> > > > > > why not move that portion from the default CMA to your VPU’s
> > > > > > CMA?
> > > > > >
> > > > > That is one method, but for the VPU, the contiguous memory size
> > > > > allocated by the driver is based on the video stream, so we cannot
> > > > > determine the maximum size of memory required by the VPU. This
> > > > > makes it impossible for us to determine the size of the specific
> > > > > CMA assigned to the VPU. Thanks.
> > > >
> > > > I don't understand how this can happen. You should precisely know
> > > > the maximum size required for the VPU based on your multimedia
> > > > pipeline and resolutions.
> > > >
> > > We cannot estimate the maximum contiguous memory required by the VPU
> > > because it depends on how the video is encoded.
> > > Thanks very much.
> >
> > Yes, you can. Please ask your multimedia team; they will give you a number.
> >
> > >
> > > > I still don't understand your scenarios or the problem you are facing.
> > >
> > >
> >

Thanks
Barry