On Fri, 17 Nov 2023 00:08:31 +0100
David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 14.11.23 19:02, Sumanth Korikkar wrote:
> > Hi All,
> > 
> > The patch series implements "memmap on memory" feature on s390 and
> > provides the necessary fixes for it.  
> Thinking about this, one thing that makes s390x different from all the 
> other architectures in this series is the altmap handling.
> I'm curious, why is that even required?
> A memmap that is not marked as online in the section should not be
> touched by anybody (except memory onlining code :) ). And if we do, it's 
> usually a BUG because that memmap might contain garbage/be poisoned or 
> completely stale, so we might want to track that down and fix it in any 
> case.
> So what speaks against just letting add_memory() populate the memmap
> from the altmap? Then, also the page tables for the memmap are already 
> in place when onlining memory.

Good question, I am not 100% sure if we ran into bugs, or simply assumed
that it is not OK to call __add_pages() when the memory for the altmap
is not accessible.

Maybe there is also already a common code bug with that, s390 might be
special but that is often also good for finding bugs in common code ...

> Then, adding two new notifier calls, one at the start of
> memory_block_online() called something like MEM_PREPARE_ONLINE and one at
> the end of memory_block_offline() called something like
> MEM_FINISH_OFFLINE, is still
> suboptimal, but that's where standby memory could be 
> activated/deactivated, without messing with the altmap.
> That way, the only s390x specific thing is that the memmap that should 
> not be touched by anybody is actually inaccessible, and you'd 
> activate/deactivate simply from the new notifier calls just the way we 
> used to do.
> It's still all worse than just adding/removing memory properly, using a 
> proper interface -- where you could alloc/free an actual memmap when the 
> altmap is not desired. But I know that people don't want to spend time 
> just doing it cleanly from scratch.

Yes, sometimes they need to be forced to do that :-)

So, we'll look into defining a "proper interface", and treat patches 1-3
separately as bug fixes? Patch 3 in particular might be interesting for arm:
if they do not have ZONE_DEVICE but still use the functions, they might
end up with the no-op version, not really freeing any memory.
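
To make the ordering of the suggested notifier calls concrete, here is a
rough userspace sketch. It is not kernel code and only models the idea:
the memmap is populated at add time without touching the memory, standby
memory is activated from a MEM_PREPARE_ONLINE call at the start of
onlining, and deactivated from a MEM_FINISH_OFFLINE call at the end of
offlining. The two event names are the ones proposed above; struct
toy_block and all toy_* functions are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Event names as proposed in the thread; values are arbitrary here. */
enum toy_mem_event { MEM_PREPARE_ONLINE, MEM_FINISH_OFFLINE };

/* Toy stand-in for a memory block (hypothetical, not struct memory_block). */
struct toy_block {
	bool accessible;        /* standby memory activated? */
	bool memmap_populated;  /* memmap set up at add time? */
	bool online;
};

/* Hypothetical arch callback: (de)activates standby memory on the events. */
int toy_arch_notifier(struct toy_block *blk, enum toy_mem_event ev)
{
	switch (ev) {
	case MEM_PREPARE_ONLINE:
		blk->accessible = true;
		break;
	case MEM_FINISH_OFFLINE:
		blk->accessible = false;
		break;
	}
	return 0;
}

/* Add time: the memmap is set up, the memory itself stays inaccessible. */
void toy_add_memory(struct toy_block *blk)
{
	blk->memmap_populated = true;
}

/* Onlining: notify first, only then touch the now accessible memmap. */
int toy_block_online(struct toy_block *blk)
{
	int rc = toy_arch_notifier(blk, MEM_PREPARE_ONLINE);

	if (rc)
		return rc;
	if (!blk->accessible || !blk->memmap_populated)
		return -1;
	blk->online = true;
	return 0;
}

/* Offlining: take the block offline, then notify as the very last step. */
void toy_block_offline(struct toy_block *blk)
{
	blk->online = false;
	toy_arch_notifier(blk, MEM_FINISH_OFFLINE);
}
```

In this model nothing touches the block between toy_add_memory() and
toy_block_online(), which mirrors the point that a memmap not marked as
online should not be accessed by anybody.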

