Re: [PATCH] mm,page_alloc: Serialize warn_alloc() if schedulable.

Michal Hocko <mhocko@xxxxxxxx> · Sat, 3 Jun 2017 09:32:21 +0200

On Fri 02-06-17 12:59:44, Andrew Morton wrote:
> On Fri, 2 Jun 2017 09:18:18 +0200 Michal Hocko <mhocko@xxxxxxxx> wrote:
> 
> > On Thu 01-06-17 15:10:22, Andrew Morton wrote:
> > > On Thu, 1 Jun 2017 15:28:08 +0200 Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > 
> > > > On Thu 01-06-17 22:11:13, Tetsuo Handa wrote:
> > > > > Michal Hocko wrote:
> > > > > > On Thu 01-06-17 20:43:47, Tetsuo Handa wrote:
> > > > > > > Cong Wang has reported a lockup when running LTP memcg_stress test [1].
> > > > > >
> > > > > > This seems to be on an old and not pristine kernel. Does it happen also
> > > > > > on the vanilla up-to-date kernel?
> > > > > 
> > > > > 4.9 is not an old kernel! It might be close to the kernel version which
> > > > > enterprise distributions would choose for their next long term supported
> > > > > version.
> > > > > 
> > > > > And please stop saying "can you reproduce your problem with latest
> > > > > linux-next (or at least latest linux)?" Not everybody can use the vanilla
> > > > > up-to-date kernel!
> > > > 
> > > > The changelog mentioned that the source of stalls is not clear so this
> > > > might be out-of-tree patches doing something wrong and dump_stack
> > > > showing up just because it is called often. This wouldn't be the first
> > > > time I have seen something like that. I am not really keen on adding
> > > > heavy lifting for something that is not clearly debugged and based on
> > > > hand waving and speculations.
> > > 
> > > I'm thinking we should serialize warn_alloc anyway, to prevent the
> > > output from concurrent calls getting all jumbled together?
> > 
> > dump_stack already serializes concurrent calls.
> 
> Sure.  But warn_alloc() doesn't.

I really do not see why that would be much better. warn_alloc is
more or less one line + dump_stack + warn_alloc_show_mem. A single line
shouldn't be a big deal even though it is actually a continuation
line. dump_stack already contains its own synchronization and the
meminfo stuff is ratelimited to one per second. So why exactly do we
want to put yet another lock on top? Just to stick them together? Well,
is this worth a new lock dependency between memory allocation and the
whole printk stack or dump_stack? Maybe yes, but this needs much deeper
consideration.

Tetsuo is arguing that the locking will throttle warn_alloc callers and
that can help other processes to move on. I would call it papering over
a real issue which might be somewhere else and that is why I push back so
hard. The initial report is far from complete and seeing 30+ second
stalls without any indication that this is just a repeating stall after
10s and 20s suggests that we got stuck somewhere in the reclaim path.

Moreover, let's assume that the unfair locking in dump_stack has caused
the stall. How would a warn_alloc lock help when there are other
sources of dump_stack all over the kernel?

Seriously, this whole discussion is based on hand waving. As with
any other patch, the real issue should be debugged, explained and
discussed based on known facts, not speculation. As things stand now,
my NACK still holds. I am not going to waste my time repeating the same
points all over again.

-- 
Michal Hocko
SUSE Labs
