Hi,

On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote:
> On 02.02.25 01:19, Suren Baghdasaryan wrote:
> > Hi,
> 
> Hi,
> 
> > I would like to discuss the Guaranteed Contiguous Memory Allocator
> > (GCMA) mechanism that is being used by many Android vendors as an
> > out-of-tree feature, collect input on its possible usefulness for
> > others and its feasibility to upstream, and gather suggestions for
> > possible better alternatives.
> > 
> > Problem statement: Some workloads/hardware require physically
> > contiguous memory, and carving out reserved memory areas for such
> > allocations often leads to inefficient usage of those carveouts. CMA
> > was designed to solve this inefficiency by allowing movable memory
> > allocations to use this reserved memory when it's otherwise unused.
> > When a contiguous memory allocation is requested, CMA finds the
> > requested contiguous area, possibly migrating some of the movable
> > pages out of that area.
> > In latency-sensitive use cases, like face unlock on phones, we need
> > to allocate contiguous memory quickly, and page migration in CMA
> > takes enough time to cause user-perceptible lag. Such allocations
> > can also fail if page migration is not possible.
> > 
> > GCMA (Guaranteed CMA) is a mechanism previously proposed in [1]
> > which was not upstreamed but was later adopted by many Android
> > vendors as an out-of-tree feature. It is similar to CMA, but the
> > backing memory is a cleancache backend containing only clean
> > file-backed pages. Most importantly, the kernel can't take a
> > reference to pages in the cleancache and therefore can't prevent
> > GCMA from quickly dropping them when required. This guarantees GCMA
> > low allocation latency and improves the allocation success rate.
> > 
> > We would like to standardize the GCMA implementation and upstream
> > it, since many Android vendors are asking to include it as a
> > generic feature.
> > 
> > Note: the removal of cleancache in the 5.17 kernel due to a lack of
> > users (sorry, we didn't know at the time about this use case) might
> > complicate upstreaming.
> 
> We discussed another possible user last year: using MTE tag storage
> memory while the storage is not being used to store MTE tags [1].
> 
> As long as the "ordinary RAM" that maps to a given MTE tag storage
> area does not use MTE tagging, we can reuse the MTE tag storage
> ("almost ordinary RAM, just that it doesn't support MTE itself") for
> different purposes.
> 
> We need a guarantee that that memory can be freed up / migrated once
> the tag storage gets activated.

If I remember correctly, one of the issues with the MTE project that
might be relevant to GCMA was that once userspace gets hold of a page,
it can pin it for a very long time without specifying FOLL_LONGTERM.

There were two examples given for this; there might be more, or they
might have been eliminated since then:

* The page is used as a buffer for accesses to a file opened with
  O_DIRECT.

* "vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet"
  - that's a direct quote from David [1].
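To make the vmsplice() case a bit more concrete, below is a minimal,
hypothetical userspace sketch (not taken from any of the earlier mails;
the buffer and pipe sizes are made up) of how pages can end up
referenced indefinitely without a FOLL_LONGTERM pin: the pages spliced
into the pipe stay referenced until the pipe data is consumed, and
nothing ever reads it here.

/*
 * Hypothetical illustration only: vmsplice() takes references on the
 * user pages and the pipe buffers hold on to them until the data is
 * consumed.  If nothing ever reads the pipe, the pages stay referenced
 * (and hence unmovable) indefinitely, without a FOLL_LONGTERM pin.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	size_t len = 64 * 1024;		/* fits in a default-sized pipe */
	int pipefd[2];
	struct iovec iov;
	char *buf;

	if (pipe(pipefd))
		exit(1);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		exit(1);
	memset(buf, 0xaa, len);		/* fault the pages in */

	iov.iov_base = buf;
	iov.iov_len = len;

	/* The pages backing buf are now referenced by the pipe buffers. */
	if (vmsplice(pipefd[1], &iov, 1, 0) < 0)
		exit(1);

	/* Nobody reads pipefd[0], so the references are never dropped. */
	pause();
	return 0;
}

The O_DIRECT case is similar, except the references are normally
dropped once the I/O completes, so the pin is only "long" rather than
potentially forever.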
Depending on your use cases, failing the allocation might be
acceptable, but for MTE that wasn't the case.

Hope some of this is useful.

[1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@xxxxxxxxxx/

Thanks,
Alex

> We continued that discussion offline, and two users of such memory we
> discussed would be frontswap, and using it as a memory backend for
> something like swap/zswap, where the pages cannot get pinned / turned
> unmovable.
> 
> [1] https://lore.kernel.org/linux-mm/ZOc0fehF02MohuWr@xxxxxxx/
> 
> -- 
> Cheers,
> 
> David / dhildenb