Re: [PATCH v2 0/5] Allocate memmap from hotadded memory

Rashmica Gupta <rashmica.g@xxxxxxxxx> · Tue, 30 Jul 2019 17:08:15 +1000



On Mon, 2019-07-29 at 10:06 +0200, David Hildenbrand wrote:
> > > Of course, other interfaces might make sense.
> > > 
> > > You can then start using these memory blocks and hinder them from
> > > getting onlined (as a safety net) via memory notifiers.
> > > 
> > > That would at least avoid you having to call
> > > add_memory/remove_memory/offline_pages/device_online/modifying
> > > memblock states manually.
> > 
> > I see what you're saying and that definitely sounds safer.
> > 
> > We would still need to call remove_memory and add_memory from
> > memtrace as just offlining memory doesn't remove it from the linear
> > page tables (if it's still in the page tables then hardware can
> > prefetch it and if hardware tracing is using it then the box
> > checkstops).
> 
> That prefetching part is interesting (and nasty as well). If we could
> at least get rid of the manual onlining/offlining, I would be able to
> sleep better at night ;) One step at a time.
>

Ok, I'll get to that soon :)

> > > (binding the memory block devices to a driver would be nicer, but
> > > the infrastructure is not really there yet - we have no such
> > > drivers in place yet)
> > > 
> > > > I don't know the mm code nor how the notifiers work very well
> > > > so I can't quite see how the above would work. I'm assuming
> > > > memtrace would register a hotplug notifier and when memory is
> > > > offlined from userspace, the callback func in memtrace would be
> > > > called if the priority was high enough? But how do we know that
> > > > the memory being offlined is intended for us to touch? Is there
> > > > a way to offline memory from userspace not using sysfs or have I
> > > > missed something in the sysfs interface?
> > > 
> > > The notifier would really only be used to hinder onlining as a
> > > safety net. User space prepares (offlines) the memory blocks and
> > > then tells the drivers which memory blocks to use.
> > > 
> > > > On a second read, perhaps you are assuming that memtrace is used
> > > > after adding new memory at runtime? If so, that is not the case.
> > > > If not, then would you be able to clarify what I'm not seeing?
> > > 
> > > The main problem I see is that you are calling
> > > add_memory/remove_memory() on memory your device driver doesn't
> > > own. It could reside on a DIMM if I am not mistaken (or later on
> > > paravirtualized memory devices like virtio-mem if I ever get to
> > > implement them ;) ).
> > 
> > This is just for baremetal/powernv so shouldn't affect virtual
> > memory devices.
> 
> Good to know.
> 
> > > How is it guaranteed that the memory you are allocating does not
> > > reside on a DIMM for example added via add_memory() by the ACPI
> > > driver?
> > 
> > Good point. We don't have ACPI on powernv but currently this would
> > try to remove memory from any online memory node, not just the ones
> > that are backed by RAM. oops.
> 
> Okay, so essentially no memory hotplug/unplug along with memtrace.
> (can we document that somewhere?). I think
> add_memory()/try_remove_memory() could be tolerable in these
> environments (as it's only boot memory).
>
Sure thing.
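
For illustration only, the "safety net" idea from the quoted discussion
(a memory notifier that refuses onlining of blocks a driver is using)
could be modeled roughly as below. This is a userspace sketch, not kernel
code and not what memtrace does today: every model_* / memtrace_model_*
name is hypothetical. In the kernel the equivalent pieces would be
register_memory_notifier(), the MEM_GOING_ONLINE action, struct
memory_notify (start_pfn, nr_pages) and a NOTIFY_BAD return value. It
assumes the driver has already been told which offlined ranges it may use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's NOTIFY_OK / NOTIFY_BAD (hypothetical names). */
enum { MODEL_NOTIFY_OK = 0, MODEL_NOTIFY_BAD = 1 };

/* Stand-ins for the MEM_GOING_ONLINE / MEM_GOING_OFFLINE actions. */
enum model_mem_action { MODEL_MEM_GOING_ONLINE, MODEL_MEM_GOING_OFFLINE };

/* Mirrors the two struct memory_notify fields this sketch needs. */
struct model_memory_notify {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

#define MODEL_MAX_CLAIMED 16

/* Ranges user space has offlined and handed to the (modeled) driver. */
static struct model_memory_notify claimed[MODEL_MAX_CLAIMED];
static size_t nr_claimed;

/* Record a range the driver now uses; false if empty or table is full. */
bool memtrace_model_claim(unsigned long start_pfn, unsigned long nr_pages)
{
	if (nr_pages == 0 || nr_claimed == MODEL_MAX_CLAIMED)
		return false;
	claimed[nr_claimed].start_pfn = start_pfn;
	claimed[nr_claimed].nr_pages = nr_pages;
	nr_claimed++;
	return true;
}

/* Half-open interval overlap test on [start_pfn, start_pfn + nr_pages). */
static bool ranges_overlap(const struct model_memory_notify *a,
			   const struct model_memory_notify *b)
{
	return a->start_pfn < b->start_pfn + b->nr_pages &&
	       b->start_pfn < a->start_pfn + a->nr_pages;
}

/* Notifier callback: veto onlining of anything overlapping a claimed range. */
int memtrace_model_notifier(enum model_mem_action action,
			    const struct model_memory_notify *arg)
{
	size_t i;

	/* Only onlining is hindered; all other actions pass through. */
	if (action != MODEL_MEM_GOING_ONLINE)
		return MODEL_NOTIFY_OK;

	for (i = 0; i < nr_claimed; i++) {
		if (ranges_overlap(&claimed[i], arg))
			return MODEL_NOTIFY_BAD;
	}
	return MODEL_NOTIFY_OK;
}
```

The point of the split is that user space stays in charge of offlining
via the existing interface, and the driver only needs the veto to stop
someone onlining a block behind its back. It does not address the
linear-mapping/prefetch problem, which would still need
remove_memory()/add_memory() as noted above.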