Re: [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29.08.19 14:15, Michal Hocko wrote:
> On Thu 29-08-19 14:08:48, David Hildenbrand wrote:
>> On 29.08.19 13:43, David Hildenbrand wrote:
>>> On 29.08.19 13:33, David Hildenbrand wrote:
>>>> On 29.08.19 10:23, Michal Hocko wrote:
>>>>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote:
>>>>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all
>>>>>> zones when removing memory". I decided to go one step further and finally
>>>>>> factor out the shrinking of zones from memory removal code. Zones are now
>>>>>> fixed up when offlining memory/onlining of memory fails/before removing
>>>>>> ZONE_DEVICE memory.
>>>>>
>>>>> I was about to say Yay! but then reading...
>>>>
>>>> Almost ;)
>>>>
>>>>>
>>>>>> Example:
>>>>>>
>>>>>> :/# cat /proc/zoneinfo
>>>>>> Node 1, zone  Movable
>>>>>>         spanned  0
>>>>>>         present  0
>>>>>>         managed  0
>>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state 
>>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
>>>>>> :/# cat /proc/zoneinfo
>>>>>> Node 1, zone  Movable
>>>>>>         spanned  98304
>>>>>>         present  65536
>>>>>>         managed  65536
>>>>>> :/# echo 0 > /sys/devices/system/memory/memory43/online
>>>>>> :/# cat /proc/zoneinfo
>>>>>> Node 1, zone  Movable
>>>>>>         spanned  32768
>>>>>>         present  32768
>>>>>>         managed  32768
>>>>>> :/# echo 0 > /sys/devices/system/memory/memory41/online
>>>>>> :/# cat /proc/zoneinfo
>>>>>> Node 1, zone  Movable
>>>>>>         spanned  0
>>>>>>         present  0
>>>>>>         managed  0
>>>>>
>>>>> ... this made me realize that you are trying to fix it instead. Could
>>>>> you explain why do we want to do that? Why don't we simply remove all
>>>>> that crap? Why do we even care about zone boundaries when offlining or
>>>>> removing memory? Zone shrinking was mostly necessary with the previous
>>>>> onlining semantic when the zone type could be only changed on the
>>>>> boundary or unassociated memory. We can interleave memory zones now
>>>>> arbitrarily.
>>>>
>>>> Last time I asked whether we can just drop all that nasty
>>>> zone->contiguous handling I was being told that it does have a
>>>> significant performance impact and is here to stay. The boundaries are a
>>>> key component to detect whether a zone is contiguous.
>>>>
>>>> So yes, while we allow interleaved memory zones, having contiguous zones
>>>> is beneficial for performance. That's why also memory onlining code will
>>>> try to online memory as default to the zone that will keep/make zones
>>>> contiguous.
>>>>
>>>> Anyhow, I think with this series most of the zone shrinking code becomes
>>>> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable.
>>>>
>>>
>>> Also, there are plenty of other users of
>>> node_spanned_pages/zone_spanned_pages etc.. I don't think this can go -
>>> not that easy :)
>>>
>>
>> ... re-reading, your suggestion is to drop the zone _shrinking_ code
>> only, sorry :) That makes more sense.
>>
>> This would mean that once a zone was !contiguous, it will always remain
>> like that. Also, even empty zones after unplug would not result in
>> zone_empty() == true.
> 
> exactly. We only need to care about not declaring zone !contigious when
> offlining from ends but that should be trivial.

That won't help a lot (offlining a DIMM will offline first to last
memory block, so unlikely we can keep the zone !contiguous). However, we
could limit zone shrinking to offlining code only (easy) and not perform
it at all for ZONE_DEVICE memory. That would simplify things *a lot*.

What's your take? Remove it completely or do it only for !ZONE_DEVICE
memory when offlining/onlining fails?

I think I would prefer to try to shrink for !ZONE_DEVICE memory, then we
can at least try to keep contiguous set and reset in case it's possible.

> 
>> I can see that some users of *_spanned_pages make certain assumptions
>> based on the size (snapshot, oom killer, ...), but that would already be
>> wrong in case the zone is very sparse.
> 
> at least oom killer usage is certainly wrong. I will have a look.
> 
>> I'll prepare something, then we can discuss.
> 
> Thanks!
> 


-- 

Thanks,

David / dhildenb




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux