On Mon, Jul 29, 2024 at 01:44:34AM -0600, Yu Zhao wrote:
> mem_cgroup_calculate_protection() is not stateless and should only be used
> as part of a top-down tree traversal. shrink_one() traverses the per-node
> memcg LRU instead of the root_mem_cgroup tree, and therefore it should not
> call mem_cgroup_calculate_protection().
>
> The existing misuse in shrink_one() can cause ineffective protection of
> sub-trees that are grandchildren of root_mem_cgroup. Fix it by reusing
> lru_gen_age_node(), which already traverses the root_mem_cgroup tree, to
> calculate the protection.
>
> Previously lru_gen_age_node() opportunistically skips the first pass,
> i.e., when scan_control->priority is DEF_PRIORITY. On the second pass,
> lruvec_is_sizable() uses appropriate scan_control->priority, set by
> set_initial_priority() from lru_gen_shrink_node(), to decide whether a
> memcg is too small to reclaim from.
>
> Now lru_gen_age_node() unconditionally traverses the root_mem_cgroup tree.
> So it should call set_initial_priority() upfront, to make sure
> lruvec_is_sizable() uses appropriate scan_control->priority on the first
> pass. Otherwise, lruvec_is_reclaimable() can return false negatives and
> result in premature OOM kills when min_ttl_ms is used.
>
> Link: https://lkml.kernel.org/r/20240712232956.1427127-1-yuzhao@xxxxxxxxxx
> Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
> Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> Reported-by: T.J. Mercier <tjmercier@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> (cherry picked from commit 30d77b7eef019fa4422980806e8b7cdc8674493e)
> ---
>  mm/vmscan.c | 83 ++++++++++++++++++++++++-----------------------------
>  1 file changed, 38 insertions(+), 45 deletions(-)

Now queued up, thanks.

greg k-h