On 1/4/23 07:56, David Hildenbrand wrote:
On 04.01.23 00:43, Florian Fainelli wrote:
On 10/20/22 14:53, Doug Berger wrote:
MOTIVATION:
Some Broadcom devices (e.g. 7445, 7278) contain multiple memory
controllers with each mapped in a different address range within
a Uniform Memory Architecture. Some users of these systems have
expressed the desire to locate ZONE_MOVABLE memory on each
memory controller to allow user space intensive processing to
make better use of the additional memory bandwidth.
Unfortunately, the historical monotonic layout of zones would
mean that if the lowest addressed memory controller contains
ZONE_MOVABLE memory then all of the memory available from
memory controllers at higher addresses must also be in the
ZONE_MOVABLE zone. This would force all kernel memory accesses
onto the lowest addressed memory controller and significantly
reduce the amount of memory available for non-movable
allocations.
The main objective of this patch set is therefore to allow a
block of memory to be designated as part of the ZONE_MOVABLE
zone where it will always only be used by the kernel page
allocator to satisfy requests for movable pages. The term
Designated Movable Block is introduced here to represent such a
block. The favored implementation allows extension of the
'movablecore' kernel parameter to allow specification of a base
address and support for multiple blocks. The existing
'movablecore' mechanisms are retained.
BACKGROUND:
NUMA architectures support distributing movablecore memory
across each node, but it is undesirable to introduce the
overhead and complexities of NUMA on systems that don't have a
Non-Uniform Memory Architecture.
Commit 342332e6a925 ("mm/page_alloc.c: introduce kernelcore=mirror
option") also depends on zone overlap to support systems with multiple
mirrored ranges.
Commit c6f03e2903c9 ("mm, memory_hotplug: remove zone restrictions")
embraced overlapped zones for memory hotplug.
This commit set follows their lead to allow the ZONE_MOVABLE
zone to overlap other zones. Designated Movable Blocks are made
absent from overlapping zones and present within the
ZONE_MOVABLE zone.
I initially investigated an implementation using a Designated
Movable migrate type in line with comments[1] made by Mel Gorman
regarding a "sticky" MIGRATE_MOVABLE type to avoid using
ZONE_MOVABLE. However, this approach was riskier since it was
much more intrusive on the allocation paths. Ultimately, the
progress made by the memory hotplug folks to expand the
ZONE_MOVABLE functionality convinced me to follow this approach.
Mel, David, does the sub-thread discussion with Doug help ensure that
all of the context is gathered before getting into a more detailed patch
review on a patch-by-patch basis?
Eventually we may need a fairly firm answer as to whether the proposed
approach has any chance of landing upstream, in order to either commit to
it in subsequent iterations of this patch set or find an alternative.
As raised, I'd appreciate if less intrusive alternatives could be
evaluated (e.g., fake NUMA nodes and being able to just use mbind(),
moving such memory to ZONE_MOVABLE after boot via something like daxctl).
This is not an option in the environment we ultimately have to fit into,
which is Android TV using the GKI kernel. GKI does not enable NUMA and
probably never will. For similar reasons, bringing in a whole swath of
user-space tools like daxctl may not be practical either, from both a
logistical perspective (simply getting the tools built with bionic,
accepted, etc.) and a system configuration perspective.
I'm not convinced that these intrusive changes are worth it at this
point. Further, some of the assumptions (ZONE_MOVABLE == user space) are
not really future proof as I raised.
I find this patch set reasonably small in contrast to a lot of other mm/
changes. What specifically did you find intrusive?
AFAICT, the only assumption that is being made is that ZONE_MOVABLE
contains memory that can be moved, but even if it did not in the future,
there should hopefully be enough opportunities, given a large enough DMB
region, to service the allocation requests of its users. I will go back
and read your comment to make sure I don't misunderstand it.
Thanks
--
Florian