On Thu 13-03-25 16:57:34, Zhongkun He wrote:
> On Thu, Mar 13, 2025 at 3:57 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Thu 13-03-25 11:48:12, Zhongkun He wrote:
> > > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > > memory.reclaim")', we can submit an additional swappiness=<val> argument
> > > to memory.reclaim. It is very useful because we can dynamically adjust
> > > the reclamation ratio based on the anonymous folios and file folios of
> > > each cgroup. For example,when swappiness is set to 0, we only reclaim
> > > from file folios.
> > >
> > > However,we have also encountered a new issue: when swappiness is set to
> > > the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> > > to the knob of cache_trim_mode, which depends solely on the ratio of
> > > inactive folios, regardless of whether there are a large number of cold
> > > folios in anonymous folio list.
> > >
> > > So, we hope to add a new control logic where proactive memory reclaim only
> > > reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> > > For example, something like this:
> > >
> > > echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
> > >
> > > will perform reclaim on the rootcg with a swappiness setting of 200 (max
> > > swappiness) regardless of the file folios. Users have a more comprehensive
> > > view of the application's memory distribution because there are many
> > > metrics available. For example, if we find that a certain cgroup has a
> > > large number of inactive anon folios, we can reclaim only those and skip
> > > file folios, because with the zram/zswap, the IO tradeoff that
> > > cache_trim_mode is making doesn't hold - file refaults will cause IO,
> > > whereas anon decompression will not.
> > >
> > > With this patch, the swappiness argument of memory.reclaim has a more
> > > precise semantics: 0 means reclaiming only from file pages, while 200
> > > means reclaiming just from anonymous pages.
> >
> > Well, with this patch we have 0 - always swap, 200 - never swap and
> > anything inbetween behaves more or less arbitrary, right? Not a new
> > problem with swappiness but would it make more sense to drop all the
> > heuristics for scanning LRUs and simply use the given swappiness when
> > doing the pro active reclaim?
>
> Thanks for your suggestion! I totally agree with you. I'm preparing to send
> another patch to do this and a new thread to discuss, because I think the
> implementation doesn't conflict with this one. Do you think so ?

If the change will enforce SCAN_FRACT for proactive reclaim with
swappiness given then it will make the balancing much smoother but I do
not think the behavior at both ends of the scale would imply only single
LRU scanning mode.
-- 
Michal Hocko
SUSE Labs
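
For readers who want to try the memory.reclaim interface discussed in the quoted patch description, here is a minimal sketch of composing and submitting a request. The helper names build_reclaim_request and request_reclaim are hypothetical and exist only for illustration. The 0..200 range check is an assumption based on the MAX_SWAPPINESS value of 200 mentioned in the thread. Nothing here reflects how the kernel balances anon and file scanning internally.

```python
import os

# Assumption: upper bound taken from the thread (MAX_SWAPPINESS == 200).
MAX_SWAPPINESS = 200


def build_reclaim_request(size: str, swappiness: int) -> str:
    """Return the string to write to memory.reclaim (hypothetical helper)."""
    if not 0 <= swappiness <= MAX_SWAPPINESS:
        raise ValueError(f"swappiness must be in 0..{MAX_SWAPPINESS}")
    return f"{size} swappiness={swappiness}"


def request_reclaim(cgroup_dir: str, size: str, swappiness: int) -> None:
    """Write a proactive reclaim request to <cgroup_dir>/memory.reclaim."""
    path = os.path.join(cgroup_dir, "memory.reclaim")
    with open(path, "w") as f:
        f.write(build_reclaim_request(size, swappiness))
```

Calling request_reclaim("/sys/fs/cgroup", "2M", 200) would issue the same request as the echo command quoted above. On a real system this needs a cgroup v2 mount and sufficient privileges, and the write may fail if the kernel cannot reclaim the requested amount.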