On Thu, Apr 6, 2023 at 1:02 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > On Wed, Apr 05, 2023 at 03:09:27PM -0600, Yu Zhao wrote: > > On Wed, Apr 5, 2023 at 2:01 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > > static bool cgroup_reclaim(struct scan_control *sc) > > > { > > > return sc->target_mem_cgroup; > > > } > > > > > > static bool global_reclaim(struct scan_control *sc) > > > { > > > return !sc->target_mem_cgroup || mem_cgroup_is_root(sc->target_mem_cgroup); > > > } > > > > > > The name suggests it's the same thing twice, with opposite > > > polarity. But of course they're subtly different, and not documented. > > > > > > When do you use which? > > > > The problem I saw is that target_mem_cgroup is set when writing to the > > root memory.reclaim. And for this case, a few places might prefer > > global_reclaim(), e.g., in shrink_lruvec(), in addition to where it's > > being used. > > > > If this makes sense, we could 1) document it (or rename it) and apply > > it to those places, or 2) just unset target_mem_cgroup for root and > > delete global_reclaim(). Option 2 might break ABI but still be > > acceptable. > > Ah, cgroup_reclaim() tests whether it's limit/proactive reclaim or > allocator reclaim. global_reclaim() tests whether it's root reclaim > (which could be from either after memory.reclaim). > > I suppose we didn't clarify when introducing memory.reclaim what the > semantics should be on the root cgroup: > > - We currently exclude PGSCAN and PGSTEAL stats from /proc/vmstat for > limit reclaim to tell cgroup constraints from physical pressure. We > currently exclude root memory.reclaim activity as well. Seems okay. > > - The file_is_tiny heuristic is for allocator reclaim near OOM. It's > currently excluded for root memory.reclaim, which seems okay too. > > - Like limit reclaim, root memory.reclaim currently NEVER swaps when > global swappiness is 0. The whole cgroup-specific swappiness=0 > semantic is kind of quirky. But I suppose we can keep it as-is. > > - Proportional reclaim is disabled for everybody but direct reclaim > from the allocator at initial priority. Effectively disabled for all > situations where it might matter, including root memory.reclaim. We > should also keep this as-is. > > - Writeback throttling is interesting. Historically, kswapd set the > congestion state when running into lots of PG_reclaim pages, and > clear it when the node is balanced. This throttles direct reclaim. > > Cgroup limit reclaim would set and clear congestion on non-root only > to do local limit-reclaim throttling. But now root memory.reclaim > might clear a bit set by kswapd before the node is balanced, and > release direct reclaimers from throttling prematurely. This seems > wrong. However, the alternative is throttling memory.reclaim on > subgroup congestion but not root, which seems also wrong. > > - Root memory.reclaim is exempted from the compaction_ready() bail > condition as well as soft limit reclaim. But they'd both be no-ops > anyway if we changed cgroup_reclaim() semantics. > > IMO we should either figure out what we want to do in those cases > above, at least for writeback throttling. > > Are you guys using root-level proactive reclaim? > > > If we don't want to decide right now, I can rename global_reclaim() to > > root_reclaim() and move it within #ifdef CONFIG_LRU_GEN and probably > > come back and revisit later. > > So conventional vmscan treats root-level memory.reclaim the same as > any other cgroup reclaim. And the cgroup_reclaim() checks are mostly > reserved for (questionable) allocator reclaim-specific heuristics. > > Is there a chance lrugen could do the same, and you'd be fine with > using cgroup_reclaim()? Or is it a data structure problem? > > If so, I agree it could be better to move it to CONFIG_LRU_GEN and > rename it for the time being. root_reclaim() SGTM. Oh and if we decide to keep the helper as root_reclaim I would prefer it to be outside CONFIG_LRU_GEN so that I can still use it in the patch series that this thread was originally a part of (ignoring non-LRU freed pages in memcg reclaim).