Re: [PATCH v3 2/2] mm: memcg: introduce new event to trace shrink_memcg

On Wed, Nov 29, 2023 at 06:10:33PM +0100, Michal Hocko wrote:
> On Wed 29-11-23 19:57:52, Dmitry Rokosov wrote:
> > On Wed, Nov 29, 2023 at 05:06:37PM +0100, Michal Hocko wrote:
> > > On Wed 29-11-23 18:20:57, Dmitry Rokosov wrote:
> > > > On Tue, Nov 28, 2023 at 10:32:50AM +0100, Michal Hocko wrote:
> > > > > On Mon 27-11-23 19:16:37, Dmitry Rokosov wrote:
> > > [...]
> > > > > > 2) With this approach, we will not have the ability to trace a situation
> > > > > > where the kernel is requesting reclaim for a specific memcg, but due to
> > > > > > limits issues, we are unable to run it.
> > > > > 
> > > > > I do not follow. Could you be more specific please?
> > > > > 
> > > > 
> > > > I'm referring to a situation where kswapd() or other kernel mm code
> > > > requests reclaim of pages from a memcg, but the memcg is skipped
> > > > because of the limit checks. This occurs in the shrink_node_memcgs()
> > > > function.
> > > 
> > > Ohh, you mean reclaim protection
> > > 
> > > > ===
> > > > 		mem_cgroup_calculate_protection(target_memcg, memcg);
> > > > 
> > > > 		if (mem_cgroup_below_min(target_memcg, memcg)) {
> > > > 			/*
> > > > 			 * Hard protection.
> > > > 			 * If there is no reclaimable memory, OOM.
> > > > 			 */
> > > > 			continue;
> > > > 		} else if (mem_cgroup_below_low(target_memcg, memcg)) {
> > > > 			/*
> > > > 			 * Soft protection.
> > > > 			 * Respect the protection only as long as
> > > > 			 * there is an unprotected supply
> > > > 			 * of reclaimable memory from other cgroups.
> > > > 			 */
> > > > 			if (!sc->memcg_low_reclaim) {
> > > > 				sc->memcg_low_skipped = 1;
> > > > 				continue;
> > > > 			}
> > > > 			memcg_memory_event(memcg, MEMCG_LOW);
> > > > 		}
> > > > ===
> > > > 
> > > > With separate shrink begin()/end() tracepoints we can detect such
> > > > a problem.
> > > 
> > > How? You are only reporting the number of reclaimed pages, and zero
> > > reclaimed pages could be caused not just by low/min limits but by
> > > other reasons in general. You would also need to report the number
> > > of scanned/isolated pages.
> > >  
> > 
> > From my perspective, if memory control group (memcg) protection
> > applies, we can identify it by a begin() that has no matching end().
> > Any other reason will have both tracepoints raised.
> 
> That is not really a great way to detect that TBH. Trace events could be
> lost and then you simply do not know what has happened.

I see, thank you very much for the detailed review! I will prepare a new
patchset with memcg names in the lruvec and slab paths and will be back soon.
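
To spell out the point about lost events, here is a small userspace sketch
(not kernel code). The event kinds, the struct and the memcg ids are all
hypothetical and are not the names used in the patch. It only shows that a
consumer pairing begin()/end() events sees exactly the same stream whether
a memcg was skipped by min/low protection or its end() event was dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; not the tracepoint names from the patch. */
enum ev_kind { EV_BEGIN, EV_END };

struct ev {
	enum ev_kind kind;
	int memcg_id;	/* hypothetical memcg identifier */
};

/*
 * Count begin() events that are not immediately followed by an end()
 * event for the same memcg.
 */
static size_t count_unpaired_begin(const struct ev *events, size_t n)
{
	size_t unpaired = 0;

	assert(events != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (events[i].kind != EV_BEGIN)
			continue;
		if (i + 1 < n && events[i + 1].kind == EV_END &&
		    events[i + 1].memcg_id == events[i].memcg_id)
			i++;		/* consume the matching end() */
		else
			unpaired++;
	}
	return unpaired;
}

/* Case A: memcg 2 was skipped by min/low protection, end() never fired. */
static const struct ev skipped_by_protection[] = {
	{ EV_BEGIN, 1 }, { EV_END, 1 },
	{ EV_BEGIN, 2 },
	{ EV_BEGIN, 3 }, { EV_END, 3 },
};

/* Case B: memcg 2 was reclaimed, but its end() event was lost. */
static const struct ev end_event_lost[] = {
	{ EV_BEGIN, 1 }, { EV_END, 1 },
	{ EV_BEGIN, 2 },	/* { EV_END, 2 } dropped by the trace buffer */
	{ EV_BEGIN, 3 }, { EV_END, 3 },
};
```

Calling count_unpaired_begin() on either array with a length of 5 reports
one unpaired begin(), so the two cases cannot be told apart from the event
stream alone.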

-- 
Thank you,
Dmitry



