On Thu 29-08-19 14:29:22, David Hildenbrand wrote: > On 29.08.19 14:15, Michal Hocko wrote: > > On Thu 29-08-19 14:08:48, David Hildenbrand wrote: > >> On 29.08.19 13:43, David Hildenbrand wrote: > >>> On 29.08.19 13:33, David Hildenbrand wrote: > >>>> On 29.08.19 10:23, Michal Hocko wrote: > >>>>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote: > >>>>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all > >>>>>> zones when removing memory". I decided to go one step further and finally > >>>>>> factor out the shrinking of zones from memory removal code. Zones are now > >>>>>> fixed up when offlining memory/onlining of memory fails/before removing > >>>>>> ZONE_DEVICE memory. > >>>>> > >>>>> I was about to say Yay! but then reading... > >>>> > >>>> Almost ;) > >>>> > >>>>> > >>>>>> Example: > >>>>>> > >>>>>> :/# cat /proc/zoneinfo > >>>>>> Node 1, zone Movable > >>>>>> spanned 0 > >>>>>> present 0 > >>>>>> managed 0 > >>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state > >>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state > >>>>>> :/# cat /proc/zoneinfo > >>>>>> Node 1, zone Movable > >>>>>> spanned 98304 > >>>>>> present 65536 > >>>>>> managed 65536 > >>>>>> :/# echo 0 > /sys/devices/system/memory/memory43/online > >>>>>> :/# cat /proc/zoneinfo > >>>>>> Node 1, zone Movable > >>>>>> spanned 32768 > >>>>>> present 32768 > >>>>>> managed 32768 > >>>>>> :/# echo 0 > /sys/devices/system/memory/memory41/online > >>>>>> :/# cat /proc/zoneinfo > >>>>>> Node 1, zone Movable > >>>>>> spanned 0 > >>>>>> present 0 > >>>>>> managed 0 > >>>>> > >>>>> ... this made me realize that you are trying to fix it instead. Could > >>>>> you explain why do we want to do that? Why don't we simply remove all > >>>>> that crap? Why do we even care about zone boundaries when offlining or > >>>>> removing memory? Zone shrinking was mostly necessary with the previous > >>>>> onlining semantic when the zone type could be only changed on the > >>>>> boundary or unassociated memory. We can interleave memory zones now > >>>>> arbitrarily. > >>>> > >>>> Last time I asked whether we can just drop all that nasty > >>>> zone->contiguous handling I was being told that it does have a > >>>> significant performance impact and is here to stay. The boundaries are a > >>>> key component to detect whether a zone is contiguous. > >>>> > >>>> So yes, while we allow interleaved memory zones, having contiguous zones > >>>> is beneficial for performance. That's why also memory onlining code will > >>>> try to online memory as default to the zone that will keep/make zones > >>>> contiguous. > >>>> > >>>> Anyhow, I think with this series most of the zone shrinking code becomes > >>>> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable. > >>>> > >>> > >>> Also, there are plenty of other users of > >>> node_spanned_pages/zone_spanned_pages etc.. I don't think this can go - > >>> not that easy :) > >>> > >> > >> ... re-reading, your suggestion is to drop the zone _shrinking_ code > >> only, sorry :) That makes more sense. > >> > >> This would mean that once a zone was !contiguous, it will always remain > >> like that. Also, even empty zones after unplug would not result in > >> zone_empty() == true. > > > > exactly. We only need to care about not declaring zone !contigious when > > offlining from ends but that should be trivial. > > That won't help a lot (offlining a DIMM will offline first to last > memory block, so unlikely we can keep the zone !contiguous). However, we > could limit zone shrinking to offlining code only (easy) and not perform > it at all for ZONE_DEVICE memory. That would simplify things *a lot*. > > What's your take? Remove it completely or do it only for !ZONE_DEVICE > memory when offlining/onlining fails? > > I think I would prefer to try to shrink for !ZONE_DEVICE memory, then we > can at least try to keep contiguous set and reset in case it's possible. I would remove that code altogether if that is possible and doesn't introduce any side effects I am not aware right now. All the existing code has to deal with holes already so I do not see any reason why it cannot do the same with holes at both ends. -- Michal Hocko SUSE Labs