On 04/10/2018 08:19, Michal Hocko wrote:
> On Wed 03-10-18 19:14:05, David Hildenbrand wrote:
>> On 03/10/2018 16:34, Vitaly Kuznetsov wrote:
>>> Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> writes:
>>>
>>>> On 10/03/2018 06:52 AM, Vitaly Kuznetsov wrote:
>>>>> It is more than just memmaps (e.g. forking udev process doing memory
>>>>> onlining also needs memory) but yes, the main idea is to make the
>>>>> onlining synchronous with hotplug.
>>>>
>>>> That's a good theoretical concern.
>>>>
>>>> But, is it a problem we need to solve in practice?
>>>
>>> Yes, unfortunately. It was previously discovered that when we try to
>>> hotplug tons of memory to a low-memory system (a common scenario with
>>> VMs) we end up with OOM, because for all new memory blocks we need to
>>> allocate page tables, struct pages, ... and we need memory to do
>>> that. The userspace program doing memory onlining also needs memory
>>> to run, and in case it prefers to fork to handle hundreds of
>>> notifications ... well, it may get OOM-killed before it manages to
>>> online anything.
>>>
>>> Allocating all kernel objects from the newly hotplugged blocks would
>>> definitely help to manage the situation, but as I said this won't
>>> solve the 'forking udev' problem completely (it will likely remain in
>>> 'extreme' cases only; we can probably work around it by onlining with
>>> a dedicated process which doesn't do memory allocation).
>>
>> I guess the problem is even worse. We always have two phases:
>>
>> 1. add memory - requires memory allocation
>> 2. online memory - might require memory allocations, e.g. for slab/slub
>>
>> So if we just added memory but don't have sufficient memory to start a
>> user space process to trigger onlining, then we most likely also don't
>> have sufficient memory to online the memory right away (in some
>> scenarios).
>>
>> We would have to allocate all new memory for phases 1 and 2 from the
>> memory to be onlined. I guess the latter part is less trivial.
>>
>> So while onlining the memory from the kernel might make things a
>> little more robust, we would still have the chance of OOM / onlining
>> failing.
>
> Yes, _theoretically_. Is this a practical problem for reasonable
> configurations, though? I mean, this will never be perfect and we
> simply cannot support all possible configurations. We should focus on
> a reasonable subset of them. From my practical experience, the vast
> majority of the memory is consumed by memmaps (roughly 1.5%). That is
> not a lot, but I agree that allocating that from the Normal zone and
> off-node is not great. Especially the second part, which is noticeable
> for whole-node hotplug.
>
> I have a feeling that arguing about fork not being able to proceed, or
> OOMing during memory hotplug, is a bit of a stretch and a sign of
> misconfiguration.

Just to rephrase: I have the same opinion. Something is already messed
up if we cannot even fork anymore; we would have OOMs all over the
place before/during/after forking.

-- 

Thanks,

David / dhildenb
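[For readers outside the thread: the userspace onlining being discussed works by writing "online" to each new memory block's sysfs `state` file, typically triggered by a udev rule. A minimal sketch of such an agent follows; the `online_all_blocks` helper is illustrative and not taken from any patch in this thread, though the sysfs path and `state` file match the standard memory-hotplug layout. Running it for real requires root on a system with hotpluggable memory.]

```python
import pathlib

# Standard sysfs location of memory-block devices on Linux.
SYSFS_MEMORY = pathlib.Path("/sys/devices/system/memory")

def online_all_blocks(base: pathlib.Path = SYSFS_MEMORY) -> list[str]:
    """Online every offline memory block under base; return block names onlined."""
    onlined = []
    for block in sorted(base.glob("memory*")):
        state = block / "state"
        if state.read_text().strip() == "offline":
            # Equivalent to: echo online > /sys/devices/system/memory/memoryN/state
            state.write_text("online")
            onlined.append(block.name)
    return onlined
```

Note that such an agent must itself stay resident and avoid allocating (and certainly avoid forking per notification) to survive the low-memory window; that is the "dedicated process which doesn't do memory allocation" workaround Vitaly mentions above.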
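[Michal's "roughly 1.5%" memmap figure can be sanity-checked with quick arithmetic. Assuming a 64-byte `struct page` and 4 KiB base pages (typical x86_64 values; both are assumptions here, not numbers from the thread), the metadata overhead is 64/4096 ≈ 1.56% of the memory being added:]

```python
# Rough memmap (struct page array) cost for newly hotplugged memory.
# Assumed values, typical for x86_64: 64-byte struct page, 4 KiB base pages.
STRUCT_PAGE_SIZE = 64
PAGE_SIZE = 4096

def memmap_bytes(added_bytes: int) -> int:
    """Bytes of struct page metadata needed to describe added_bytes of memory."""
    return (added_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE

GiB = 1 << 30
added = 128 * GiB  # e.g. hotplugging 128 GiB into a VM
meta = memmap_bytes(added)
print(f"{meta // (1 << 20)} MiB of memmap ({100 * meta / added:.2f}% overhead)")
# -> 2048 MiB of memmap (1.56% overhead)
```

That 2 GiB has to come from memory that is already online, which is why allocating the memmap from the hotplugged range itself (and on the right node) matters for whole-node hotplug.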