Re: [PATCH] mm:memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create

Minchan Kim <minchan@xxxxxxxxxx> · Fri, 20 Apr 2018 14:42:39 +0900

On Thu, Apr 19, 2018 at 08:40:05AM +0200, Michal Hocko wrote:
> On Wed 18-04-18 11:58:00, David Rientjes wrote:
> > On Wed, 18 Apr 2018, Michal Hocko wrote:
> > 
> > > > Okay, no problem. However, I don't feel we need a ratelimit at this moment.
> > > > We can add one when we get a real report. Let's add just a one-line warning.
> > > > However, I have no talent for writing a poem that fits in one line.
> > > > Could you help me?
> > > 
> > > What about
> > > 	pr_info("Failed to create memcg slab cache. Report if you see floods of these\n");
> > >  

Thank you, Michal. However, hmm, "floods" is very vague to me. 100 times per
second? 10 times per hour? I guess we need a clearer guideline for when users
should report, if we really want to do this.

> > 
> > Um, there's nothing actionable here for the user.  Even if the message 
> > directed them to a specific email address, what would you ask the user for 
> > in response if they show a kernel log with 100 of these?
> 
> We would have to think of a better way to create shadow memcg caches.
> 
> > Probably ask 
> > them to use sysrq at the time it happens to get meminfo.  But any user 
> > initiated sysrq is going to reveal very different state of memory compared 
> > to when the kmalloc() actually failed.
> 
> Not really.
> 
> > If this really needs a warning, I think it only needs to be done once and 
> > reveal the state of memory similar to how slub emits oom warnings.  But as 
> > the changelog indicates, the system is oom and we couldn't reclaim.  We 
> > can expect this happens a lot on systems with memory pressure.  What is 
> > the warning revealing that would be actionable?
> 
> That it actually happens in real workloads and we want to know what
> those workloads are. This code is quite old and yet this is the first
> time somebody complains. So it is most probably rare. Maybe because most
> workloads don't create many memcgs dynamically while low on memory.
> And maybe that will change in future. In any case, having a large splat
> of meminfo for GFP_NOWAIT is not really helpful. It will tell us what we
> know already - the memory is low and the reclaim was prohibited. We just
> need to know that this happens out there.

The workload was an experiment in creating a memcg per app on an embedded
device, but I'm not considering kmemcg at this moment, so I can even live
with disabling kmemcg. Given that, I cannot say whether it's a real
workload or not.

Judging from the replies in this thread, adding such a one-line warning is
arguable, so if you want it strongly, could you handle it yourself?
Sorry, but I don't have any interest in arguing about it.

Thanks.