On 04/10/2018 08:19, Michal Hocko wrote:
> On Wed 03-10-18 19:14:05, David Hildenbrand wrote:
>> On 03/10/2018 16:34, Vitaly Kuznetsov wrote:
>>> Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> writes:
>>>
>>>> On 10/03/2018 06:52 AM, Vitaly Kuznetsov wrote:
>>>>> It is more than just memmaps (e.g. forking udev process doing memory
>>>>> onlining also needs memory) but yes, the main idea is to make the
>>>>> onlining synchronous with hotplug.
>>>>
>>>> That's a good theoretical concern.
>>>>
>>>> But, is it a problem we need to solve in practice?
>>>
>>> Yes, unfortunately. It was previously discovered that when we try to
>>> hotplug tons of memory to a low-memory system (a common scenario with
>>> VMs) we end up with OOM, because for all new memory blocks we need to
>>> allocate page tables, struct pages, ... and we need memory to do
>>> that. The userspace program doing memory onlining also needs memory
>>> to run, and in case it prefers to fork to handle hundreds of
>>> notifications ... well, it may get OOM-killed before it manages to
>>> online anything.
>>>
>>> Allocating all kernel objects from the newly hotplugged blocks would
>>> definitely help to manage the situation, but as I said this won't
>>> solve the 'forking udev' problem completely (it will likely remain in
>>> 'extreme' cases only; we can probably work around it by onlining with
>>> a dedicated process which doesn't do memory allocation).
>>
>> I guess the problem is even worse. We always have two phases:
>>
>> 1. add memory - requires memory allocation
>> 2. online memory - might require memory allocations, e.g. for slab/slub
>>
>> So if we just added memory but don't have sufficient memory to start a
>> user space process to trigger onlining, then we most likely also don't
>> have sufficient memory to online the memory right away (in some
>> scenarios).
>>
>> We would have to allocate all new memory for phases 1 and 2 from the
>> memory to be onlined. I guess the latter part is less trivial.
>>
>> So while onlining the memory from the kernel might make things a
>> little more robust, we would still have the chance of OOM / onlining
>> failing.
>
> Yes, _theoretically_. Is this a practical problem for reasonable
> configurations, though? I mean, this will never be perfect and we
> simply cannot support all possible configurations. We should focus on
> a reasonable subset of them. From my practical experience, the vast
> majority of the memory is consumed by memmaps (roughly 1.5%). That is
> not a lot, but I agree that allocating that from the Normal zone and
> off-node is not great. Especially the second part, which is noticeable
> for whole-node hotplug.
>
> I have a feeling that arguing about fork not being able to proceed, or
> OOMing during memory hotplug, is a bit of a stretch and a sign of
> misconfiguration.

Just to rephrase: I have the same opinion. Something is already messed
up if we cannot even fork anymore; we would have OOMs all over the
place before/during/after forking.

-- 

Thanks,

David / dhildenb
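[For readers outside the thread: the userspace onlining being discussed works by writing "online" to each new memory block's sysfs `state` file, typically triggered by a udev rule. A minimal sketch of such an agent follows; the `online_all_blocks` helper is illustrative and not taken from any patch in this thread, though the sysfs path and `state` file match the standard memory-hotplug layout. Running it for real requires root on a system with hotpluggable memory.]

```python
import pathlib

# Standard sysfs location of memory-block devices on Linux.
SYSFS_MEMORY = pathlib.Path("/sys/devices/system/memory")

def online_all_blocks(base: pathlib.Path = SYSFS_MEMORY) -> list[str]:
    """Online every offline memory block under base; return block names onlined."""
    onlined = []
    for block in sorted(base.glob("memory*")):
        state = block / "state"
        if state.read_text().strip() == "offline":
            # Equivalent to: echo online > /sys/devices/system/memory/memoryN/state
            state.write_text("online")
            onlined.append(block.name)
    return onlined
```

Note that such an agent must itself stay resident and avoid allocating (and certainly avoid forking per notification) to survive the low-memory window; that is the "dedicated process which doesn't do memory allocation" workaround Vitaly mentions above.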
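[Michal's "roughly 1.5%" memmap figure can be sanity-checked with quick arithmetic. Assuming a 64-byte `struct page` and 4 KiB base pages (typical x86_64 values; both are assumptions here, not numbers from the thread), the metadata overhead is 64/4096 ≈ 1.56% of the memory being added:]

```python
# Rough memmap (struct page array) cost for newly hotplugged memory.
# Assumed values, typical for x86_64: 64-byte struct page, 4 KiB base pages.
STRUCT_PAGE_SIZE = 64
PAGE_SIZE = 4096

def memmap_bytes(added_bytes: int) -> int:
    """Bytes of struct page metadata needed to describe added_bytes of memory."""
    return (added_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE

GiB = 1 << 30
added = 128 * GiB  # e.g. hotplugging 128 GiB into a VM
meta = memmap_bytes(added)
print(f"{meta // (1 << 20)} MiB of memmap ({100 * meta / added:.2f}% overhead)")
# -> 2048 MiB of memmap (1.56% overhead)
```

That 2 GiB has to come from memory that is already online, which is why allocating the memmap from the hotplugged range itself (and on the right node) matters for whole-node hotplug.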