Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter


 



On 08.08.23 08:29, Aneesh Kumar K V wrote:
On 8/8/23 12:05 AM, David Hildenbrand wrote:
On 07.08.23 14:41, David Hildenbrand wrote:
On 07.08.23 14:27, Michal Hocko wrote:
On Sat 05-08-23 19:54:23, Aneesh Kumar K V wrote:
[...]
Do you see a need for firmware-managed memory to be hotplugged in with
different memory block sizes?

In short. Yes. Slightly longer, a fixed size memory block semantic is
just standing in the way and I would even argue it is actively harmful.
Just have a look at ridiculously small memory blocks on ppc. I do
understand that it makes some sense to be aligned to the memory model
(so sparsemem section aligned). In an ideal world, memory hotplug v2
interface (if we ever go that path) should be physical memory range based.

Yes, we discussed that a couple of times already (and so far nobody
cared to implement any of that).

Small memory block sizes are very beneficial for use cases like PPC
DLPAR, virtio-mem, hyperv-balloon, ... essentially in most virtual
environments where you might want to add/remove memory in very small
granularity. I don't see that changing any time soon. Rather the opposite.

Small memory block sizes are suboptimal for large machines where you
might never end up removing such memory (boot memory), or when dealing
with devices that can only be removed in one piece (DIMM/kmem). We
already have memory groups in place to model that.

For the latter it might be beneficial to have memory blocks of larger
size that correspond to the physical memory ranges. That might also make
a memmap (re-)configuration easier.

Not sure if that is standing in the way or is harmful, though.


Just because I thought of something right now, I'll share it, maybe it makes sense.

Assume when we get add_memory*(MHP_MEMMAP_ON_MEMORY) and it is enabled by the admin:

1) We create a single altmap at the beginning of the memory

2) We create the existing fixed-size memory block devices, but flag them
    to be part of a single "altmap" unit.

3) Whenever we trigger offlining of a single such memory block, we
    offline *all* memory blocks belonging to that altmap, essentially
    using a single offline_pages() call and updating all memory block
    states accordingly.

4) Whenever we trigger onlining of a single such memory block, we
    online *all* memory blocks belonging to that altmap, using a single
    online_pages() call.

5) We fail remove_memory() if it doesn't cover the same (altmap) range.
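
To make steps 1) to 5) concrete, here is a minimal user-space sketch of the proposed semantics. It is a toy model, not kernel code: struct altmap_unit and the unit_*() helpers are hypothetical names invented for illustration, and the online_calls/offline_calls counters merely stand in for the single online_pages()/offline_pages() call per unit.

```c
/* Toy user-space model of the "altmap unit" idea; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 8

enum block_state { BLOCK_OFFLINE, BLOCK_ONLINE };

/* Steps 1/2: fixed-size memory blocks flagged as sharing one altmap. */
struct altmap_unit {
	size_t first_block;                  /* index of the first block in the unit */
	size_t nr_blocks;                    /* number of blocks covered by the altmap */
	enum block_state state[MAX_BLOCKS];  /* per-block state, kept in lockstep */
	int online_calls;                    /* stands in for online_pages() calls */
	int offline_calls;                   /* stands in for offline_pages() calls */
};

static void unit_init(struct altmap_unit *u, size_t first, size_t nr)
{
	assert(nr > 0 && nr <= MAX_BLOCKS);
	u->first_block = first;
	u->nr_blocks = nr;
	u->online_calls = 0;
	u->offline_calls = 0;
	for (size_t i = 0; i < MAX_BLOCKS; i++)
		u->state[i] = BLOCK_OFFLINE;
}

static bool unit_contains(const struct altmap_unit *u, size_t block)
{
	return block >= u->first_block && block - u->first_block < u->nr_blocks;
}

/*
 * Steps 3/4: a request against any one block changes the state of every
 * block in the unit, using one range-wide operation.
 */
static bool unit_set_state(struct altmap_unit *u, size_t block,
			   enum block_state new_state)
{
	if (!unit_contains(u, block))
		return false;
	if (u->state[0] == new_state)
		return true; /* all blocks share one state, nothing to do */

	if (new_state == BLOCK_ONLINE)
		u->online_calls++;
	else
		u->offline_calls++;

	for (size_t i = 0; i < u->nr_blocks; i++)
		u->state[i] = new_state;
	return true;
}

/* Step 5: removal is only allowed for exactly the altmap range. */
static bool unit_can_remove(const struct altmap_unit *u, size_t first, size_t nr)
{
	return first == u->first_block && nr == u->nr_blocks;
}
```

In this model, onlining block 5 of a unit covering blocks 4..7 flips all four blocks with a single counted operation, and a removal request for only blocks 4..5 is refused.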

So we can avoid having a memory block v2 (and all that comes with that ...) for now and still get that altmap stuff sorted out. As that altmap behavior can be controlled by the admin, we should be fine for now.

I think all memory notifiers should already be able to handle bigger granularity, but it would be easy to check. Some internal things might require a bit of tweaking.


We can look at the possibility of using the altmap space reserved for a namespace (via option -M dev) for allocating struct page memory even with dax/kmem.

Right, an alternative would also be for the caller to pass in the altmap. Individual memory blocks can then get onlined/offlined as is.

One issue might be how to get that altmap considered online memory (e.g., pfn_to_online_page(), kdump, ...). Right now, the altmap always falls into an online section once the memory block is online, so pfn_to_online_page() and especially online_section() hold as soon as the altmap has reasonable content, that is, once the memory block is online and the memmap has been initialized. Maybe that can be worked around, just pointing it out.
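
As a rough illustration of that concern, the following toy model (user space, not kernel code) tracks an online flag per section and checks whether the pages backing an altmap fall into online sections. The struct, the helper names, and the tiny PAGES_PER_SECTION value are assumptions made up for this sketch; it only shows one reading of the problem, where an altmap placed in one block backs the memmap of another block that is onlined independently.

```c
/* Toy model of "is the altmap in an online section?"; names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_SECTION 8UL /* deliberately tiny, for illustration only */
#define NR_SECTIONS 4

struct model {
	bool section_online[NR_SECTIONS]; /* per-section online flag */
	unsigned long altmap_base_pfn;    /* first pfn holding the memmap */
	unsigned long altmap_nr_pfns;     /* number of pfns used by the memmap */
};

static size_t pfn_to_section(unsigned long pfn)
{
	return pfn / PAGES_PER_SECTION;
}

/* Roughly the question online_section() answers for the section of a pfn. */
static bool model_pfn_online(const struct model *m, unsigned long pfn)
{
	size_t s = pfn_to_section(pfn);

	return s < NR_SECTIONS && m->section_online[s];
}

/* True only if every pfn backing the altmap sits in an online section. */
static bool model_altmap_online(const struct model *m)
{
	for (unsigned long i = 0; i < m->altmap_nr_pfns; i++)
		if (!model_pfn_online(m, m->altmap_base_pfn + i))
			return false;
	return true;
}
```

With the altmap at pfns 0..1 (section 0) and only section 1 onlined, pfn 8 counts as online while the altmap that holds its memmap does not; once section 0 is onlined as well, both checks pass.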

--
Cheers,

David / dhildenb




