On Wed, Nov 29, 2023 at 06:10:33PM +0100, Michal Hocko wrote:
> On Wed 29-11-23 19:57:52, Dmitry Rokosov wrote:
> > On Wed, Nov 29, 2023 at 05:06:37PM +0100, Michal Hocko wrote:
> > > On Wed 29-11-23 18:20:57, Dmitry Rokosov wrote:
> > > > On Tue, Nov 28, 2023 at 10:32:50AM +0100, Michal Hocko wrote:
> > > > > On Mon 27-11-23 19:16:37, Dmitry Rokosov wrote:
> > > [...]
> > > > > > 2) With this approach, we will not have the ability to trace a situation
> > > > > > where the kernel is requesting reclaim for a specific memcg, but due to
> > > > > > limits issues, we are unable to run it.
> > > > >
> > > > > I do not follow. Could you be more specific please?
> > > > >
> > > >
> > > > I'm referring to a situation where kswapd() or another kernel mm code
> > > > requests some reclaim pages from memcg, but memcg rejects it due to
> > > > limits checkers. This occurs in the shrink_node_memcgs() function.
> > >
> > > Ohh, you mean reclaim protection
> > >
> > > > ===
> > > > 		mem_cgroup_calculate_protection(target_memcg, memcg);
> > > >
> > > > 		if (mem_cgroup_below_min(target_memcg, memcg)) {
> > > > 			/*
> > > > 			 * Hard protection.
> > > > 			 * If there is no reclaimable memory, OOM.
> > > > 			 */
> > > > 			continue;
> > > > 		} else if (mem_cgroup_below_low(target_memcg, memcg)) {
> > > > 			/*
> > > > 			 * Soft protection.
> > > > 			 * Respect the protection only as long as
> > > > 			 * there is an unprotected supply
> > > > 			 * of reclaimable memory from other cgroups.
> > > > 			 */
> > > > 			if (!sc->memcg_low_reclaim) {
> > > > 				sc->memcg_low_skipped = 1;
> > > > 				continue;
> > > > 			}
> > > > 			memcg_memory_event(memcg, MEMCG_LOW);
> > > > 		}
> > > > ===
> > > >
> > > > With separate shrink begin()/end() tracepoints we can detect such
> > > > problem.
> > >
> > > How? You are only reporting the number of reclaimed pages and no
> > > reclaimed pages could be not just because of low/min limits but
> > > generally because of other reasons.
> > > You would need to report also the
> > > number of scanned/isolated pages.
> > >
> >
> > From my perspective, if memory control group (memcg) protection
> > restrictions occur, we can identify them by the absence of the end()
> > pair of begin(). Other reasons will have both tracepoints raised.
>
> That is not really great way to detect that TBH. Trace events could be
> lost and then you simply do not know what has happened.

I see, thank you very much for the detailed review! I will prepare a new
patchset with memcg names in the lruvec and slab paths and will be back soon.

-- 
Thank you,
Dmitry
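
For illustration only, the objection about lost trace events can be
sketched as a small userspace model. Everything in it is hypothetical:
the event kinds, the struct and the function name are invented for this
sketch and do not correspond to real kernel tracepoints or to the
patchset under review. It counts begin() events that have no matching
end() right after them, which is the proposed way of spotting a memcg
skipped by min/low protection.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds modelling a begin()/end() tracepoint pair. */
enum ev_kind { EV_SHRINK_BEGIN, EV_SHRINK_END };

struct trace_ev {
	enum ev_kind kind;
	int memcg_id;	/* hypothetical memcg identifier */
};

/*
 * Count begin events that are not immediately followed by an end event
 * for the same memcg.  The count alone cannot tell whether the end event
 * was never emitted or was emitted and then dropped by the trace buffer.
 */
size_t count_unpaired_begin(const struct trace_ev *ev, size_t n)
{
	size_t unpaired = 0;

	for (size_t i = 0; i < n; i++) {
		if (ev[i].kind != EV_SHRINK_BEGIN)
			continue;

		bool paired = i + 1 < n &&
			      ev[i + 1].kind == EV_SHRINK_END &&
			      ev[i + 1].memcg_id == ev[i].memcg_id;
		if (!paired)
			unpaired++;
	}
	return unpaired;
}
```

A stream of begin(1), begin(2), end(2) reports one unpaired begin
whether memcg 1 was skipped by protection or its end() event was simply
lost, so the two cases look identical to a consumer of the trace.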