On Wed 04-09-24 09:27:40, Davidlohr Bueso wrote:
> This adds support for allowing proactive reclaim in general on a
> NUMA system. A per-node interface extends support for beyond a
> memcg-specific interface, respecting the current semantics of
> memory.reclaim: respecting aging LRU and not supporting
> artificially triggering eviction on nodes belonging to non-bottom
> tiers.
>
> This patch allows userspace to do:
>
>     echo 512M swappiness=10 > /sys/devices/system/node/nodeX/reclaim
>
> One of the premises for this is to semantically align as best as
> possible with memory.reclaim. During a brief time memcg did
> support nodemask until 55ab834a86a9 (Revert "mm: add nodes=
> arg to memory.reclaim"), for which semantics around reclaim
> (eviction) vs demotion were not clear, rendering charging
> expectations to be broken.
>
> With this approach:
>
> 1. Users who do not use memcg can benefit from proactive reclaim.

It would be great to have some specific examples here. Is there a
specific reason memcg is not used?

> 2. Proactive reclaim on top tiers will trigger demotion, for which
> memory is still byte-addressable. Reclaiming on the bottom nodes
> will trigger evicting to swap (the traditional sense of reclaim).
> This follows the semantics of what is today part of the aging process
> on tiered memory, mirroring what every other form of reclaim does
> (reactive and memcg proactive reclaim). Furthermore per-node proactive
> reclaim is not as susceptible to the memcg charging problem mentioned
> above.
>
> 3. Unlike memcg, there should be no surprises of callers expecting
> reclaim but instead got a demotion. Essentially relying on behavior
> of shrink_folio_list() after 6b426d071419 (mm: disable top-tier
> fallback to reclaim on proactive reclaim), without the expectations
> of try_to_free_mem_cgroup_pages().

I am not sure I understand. If you demote, then you effectively reclaim,
because you free up memory on the specific node. Or am I misreading
what you mean? Maybe you meant to say that the overall memory
consumption across all nodes is not affected?

Your points 4 and 5 follow up on this, so we should clarify that
before going there.
-- 
Michal Hocko
SUSE Labs