Re: [LSF/MM/BPF TOPIC] Guaranteed CMA

On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei
> <alexandru.elisei@xxxxxxx> wrote:
> >
> > Hi,
> >
> > On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote:
> > > On 02.02.25 01:19, Suren Baghdasaryan wrote:
> > > > Hi,
> > >
> > > Hi,
> > >
> > > > I would like to discuss the Guaranteed Contiguous Memory Allocator
> > > > (GCMA) mechanism that is being used by many Android vendors as an
> > > > out-of-tree feature, collect input on its possible usefulness for
> > > > others, the feasibility of upstreaming it, and suggestions for better
> > > > alternatives.
> > > >
> > > > Problem statement: Some workloads/hardware require physically
> > > > contiguous memory and carving out reserved memory areas for such
> > > > allocations often leads to inefficient usage of those carveouts. CMA
> > > > was designed to solve this inefficiency by allowing movable memory
> > > > allocations to use this reserved memory when it’s otherwise unused.
> > > > When a contiguous memory allocation is requested, CMA finds the
> > > > requested contiguous area, possibly migrating some of the movable
> > > > pages out of that area.
> > > > In latency-sensitive use cases, like face unlock on phones, we need to
> > > > allocate contiguous memory quickly and page migration in CMA takes
> > > > enough time to cause user-perceptible lag. Such allocations can also
> > > > fail if page migration is not possible.
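> > > >
> > > > To make the cost concrete, here is a minimal sketch (illustrative
> > > > names and sizes, no error handling) of how a driver typically pulls
> > > > a contiguous buffer out of a reserved CMA area today; the migration
> > > > of in-use movable pages happens inside cma_alloc():
> > > >
> > > >     #include <linux/cma.h>
> > > >
> > > >     #define BUF_PAGES 1024	/* 4MB with 4K pages */
> > > >
> > > >     /* 'cma' refers to an area reserved at boot, e.g. via
> > > >      * cma_declare_contiguous() or a device-tree reserved region. */
> > > >     static struct page *grab_contig_buffer(struct cma *cma)
> > > >     {
> > > >             /* May have to migrate movable pages out of the range:
> > > >              * this is the slow, potentially failing step. */
> > > >             return cma_alloc(cma, BUF_PAGES, 0, false);
> > > >     }
> > > >
> > > >     static void drop_contig_buffer(struct cma *cma, struct page *pages)
> > > >     {
> > > >             cma_release(cma, pages, BUF_PAGES);
> > > >     }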
> > > >
> > > > GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which
> > > > was not upstreamed but got adopted later by many Android vendors as an
> > > > out-of-tree feature. It is similar to CMA, but its backing memory is
> > > > used as a cleancache backend and so holds only clean file-backed pages.
> > > > Most importantly, the kernel can’t take a reference to pages in the
> > > > cleancache and therefore can’t prevent GCMA from quickly dropping them
> > > > when required. This guarantees GCMA low allocation latency and
> > > > improves the allocation success rate.
> > > >
> > > > We would like to standardize GCMA implementation and upstream it since
> > > > many Android vendors are asking to include it as a generic feature.
> > > >
> > > > Note: the removal of cleancache in the 5.17 kernel due to it having no
> > > > users (sorry, we didn’t know about this use case at the time) might
> > > > complicate upstreaming.
> > >
> > > We discussed another possible user last year: using MTE tag storage memory
> > > while that storage is not being used to store MTE tags [1].
> > >
> > > As long as the "ordinary RAM" that maps to a given MTE tag storage area does
> > > not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM,
> > > just that it doesn't support MTE itself") for different purposes.
> > >
> > > We need a guarantee that that memory can be freed up / migrated once the tag
> > > storage gets activated.
> >
> > If I remember correctly, one of the issues with the MTE project that might be
> > relevant to GCMA was that userspace, once it gets hold of a page, can pin
> > it for a very long time without specifying FOLL_LONGTERM.
> >
> > If I remember things correctly, there were two examples given for this; there
> > might be more, or they might have been eliminated since then:
> >
> > * The page is used as a buffer for accesses to a file opened with
> >   O_DIRECT.
> >
> > * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's
> >   a direct quote from David [1].
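> >
> > As an illustration of the first case: any unprivileged process can make
> > the kernel pin pages of its address space, without FOLL_LONGTERM, simply
> > by using them as an O_DIRECT buffer (minimal sketch, error handling
> > omitted, the file path is made up):
> >
> >     #define _GNU_SOURCE
> >     #include <fcntl.h>
> >     #include <stdlib.h>
> >     #include <unistd.h>
> >
> >     int main(void)
> >     {
> >             void *buf;
> >             int fd = open("/mnt/data/file", O_RDONLY | O_DIRECT);
> >
> >             /* O_DIRECT needs an aligned buffer */
> >             posix_memalign(&buf, 4096, 4096);
> >
> >             /* The page backing 'buf' is pinned (GUP, no FOLL_LONGTERM)
> >              * for as long as the read is in flight. */
> >             read(fd, buf, 4096);
> >
> >             free(buf);
> >             close(fd);
> >             return 0;
> >     }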
> >
> > Depending on your usecases, failing the allocation might be acceptable, but for
> > MTE that wasn't the case.
> >
> > Hope some of this is useful.
> >
> > [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@xxxxxxxxxx/
>
> Thanks for the references! I'll read through these discussions to see
> how much useful information for GCMA I can extract.

I wanted to get RFC code out ahead of LSF/MM and just finished putting
it together. Sorry for the last-minute posting. You can find it here:
https://lore.kernel.org/all/20250320173931.1583800-1-surenb@xxxxxxxxxx/
Thanks,
Suren.


>
> >
> > Thanks,
> > Alex
> >
> > >
> > > We continued that discussion offline, and two users of such memory we
> > > discussed were frontswap and using it as a memory backend for something
> > > like swap/zswap: cases where the pages cannot get pinned / turned unmovable.
> > >
> > > [1] https://lore.kernel.org/linux-mm/ZOc0fehF02MohuWr@xxxxxxx/
> > >
> > > --
> > > Cheers,
> > >
> > > David / dhildenb
> > >




