On 26.01.21 15:58, Oscar Salvador wrote:
> On Tue, Jan 26, 2021 at 10:36:21AM +0100, David Hildenbrand wrote:
>> I think either keep it completely simple (only free the vmemmap of hugetlb
>> pages allocated early during boot - which is not sufficient for
>> some use cases) or implement the full thing properly (meaning, solve the
>> most challenging issues to get the basics running).
>>
>> I don't want to have some easy parts of complex features merged (e.g.,
>> breaking other stuff, as you indicate below), only to later find out "it's
>> not that easy" again and be stuck with it forever.
> Well, we could try to do an optimistic allocation, without tricky looping.
> If that fails, refuse to shrink the pool at that moment.
>
> The user could always try to shrink it later via the
> /proc/sys/vm/nr_hugepages interface.
>
> But I am just thinking out loud...
The real issue seems to be discarding the vmemmap on any memory that has
movability constraints - CMA and ZONE_MOVABLE; otherwise, as discussed,
we can reuse parts of the thingy we're freeing for the vmemmap. Not that
it would be ideal: that once-a-huge-page thing will never ever be a huge
page again - but if it helps with OOM in corner cases, sure.
Possible simplification: don't perform the optimization for now with
free huge pages residing on ZONE_MOVABLE or CMA. Certainly not perfect:
what happens when migrating a huge page from ZONE_NORMAL to
(ZONE_MOVABLE|CMA)?
>>> Of course, this means that, e.g., memory-hotplug (hot-remove) will not
>>> fully work when this is in place, but well.
>>
>> Can you elaborate? Are we talking about having hugepages in
>> ZONE_MOVABLE that are not migratable (and/or dissolvable) anymore? Then
>> a clear NACK from my side.
>
> Pretty much, yeah.
Note that we most likely soon have to tackle migrating/dissolving (free)
hugetlbfs pages from alloc_contig_range() context - e.g., for CMA
allocations. That's certainly something to keep in mind regarding any
approaches that already break offline_pages().
--
Thanks,
David / dhildenb