On Thu, Dec 14, 2023 at 09:38:55AM +0100, Michal Hocko wrote: > On Tue 12-12-23 17:38:03, Dan Schatzberg wrote: > > Allow proactive reclaimers to submit an additional swappiness=<val> > > argument to memory.reclaim. This overrides the global or per-memcg > > swappiness setting for that reclaim attempt. > > You are providing the usecase in the cover letter and Andrew usually > appends that to the first patch in the series. I think it would be > better to have the usecase described here. > > [...] > > @@ -1304,6 +1297,18 @@ PAGE_SIZE multiple when read back. > > This means that the networking layer will not adapt based on > > reclaim induced by memory.reclaim. > > > > +The following nested keys are defined. > > + > > + ========== ================================ > > + swappiness Swappiness value to reclaim with > > + ========== ================================ > > + > > + Specifying a swappiness value instructs the kernel to perform > > + the reclaim with that swappiness value. Note that this has the > > + same semantics as the vm.swappiness sysctl - it sets the > > same semantics as vm.swappiness applied to memcg reclaim with all the > existing limitations and potential future extensions. Thanks, will make this change. > > > + relative IO cost of reclaiming anon vs file memory but does > > + not allow for reclaiming specific amounts of anon or file memory. > > + > > memory.peak > > A read-only single value file which exists on non-root > > cgroups. > > The biggest problem with the implementation I can see, and others have > pointed out the same, is how fragile this is. You really have to check > the code and _every_ place which defines scan_control to learn that > mem_cgroup_shrink_node, reclaim_clean_pages_from_list, > reclaim_folio_list, lru_gen_seq_write, try_to_free_pages, balance_pgdat, > shrink_all_memory and __node_reclaim. I have only checked couple of > them, like direct reclaim and kswapd and none of them really sets up > swappiness to the default memcg nor global value. So you effectively end > up with swappiness == 0. > > While the review can point those out it is quite easy to break and you > will only learn about that very indirectly. I think it would be easier > to review and maintain if you go with a pointer that would fallback to > mem_cgroup_swappiness() if NULL which will be the case for every > existing reclaimer except memory.reclaim with swappiness value. I agree. My initial implementation used a pointer for this reason. I'll switch this back. Just to be clear - I still need to initialize scan_control.swappiness in all these other places right? It appears to mostly be stack-initialized which means it has indeterminate value, not necessarily zero.