Re: [PATCH] mm, memcg: show memcg min setting in oom messages

On Wed 20-11-19 20:23:54, Yafang Shao wrote:
> On Wed, Nov 20, 2019 at 7:40 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Wed 20-11-19 18:53:44, Yafang Shao wrote:
> > > On Wed, Nov 20, 2019 at 6:22 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed 20-11-19 03:53:05, Yafang Shao wrote:
> > > > > A task running in a memcg may OOM because of the memory.min settings of its
> > > > > siblings and parent. If this happens, the current oom messages can't show
> > > > > why file page cache can't be reclaimed.
> > > >
> > > > The min limit is not the only way to protect memory from being
> > > > reclaimed. The memory might be pinned or unreclaimable for other
> > > > reasons (e.g. swap quota exceeded for memcg).
> > >
> > > Both swap and unreclaimable (unevictable) memory are printed in OOM messages.
> >
> > Not really. Consider a memcg which has reached its swap limit. The
> > anonymous memory is not really reclaimable even when there is a lot of
> > swap space available.
> >
> 
> The memcg swap limit is already printed in oom messages, see below:
> 
> [  141.721625] memory: usage 1228800kB, limit 1228800kB, failcnt 18337
> [  141.721958] swap: usage 0kB, limit 9007199254740988kB, failcnt 0
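
The swap limit in the quoted report looks odd but is most likely just the "no limit" value. A minimal sketch of the arithmetic, assuming a 64-bit kernel with 4 KiB pages where the unlimited page counter value is LONG_MAX / PAGE_SIZE pages and the report converts pages to kB (these constants are assumptions about the reporter's machine, not something stated in the thread):

```python
# Assumed constants: 64-bit long, 4 KiB pages.
LONG_MAX = 2**63 - 1
PAGE_SIZE = 4096

# Largest value a page counter can hold, in pages.
page_counter_max = LONG_MAX // PAGE_SIZE

# Convert pages to kB the way an oom report would (pages * kB per page).
limit_kb = page_counter_max * (PAGE_SIZE // 1024)

print(limit_kb)
```

Under those assumptions the result matches the 9007199254740988kB shown above, i.e. no swap limit is configured on that memcg.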

But you do not have any insight into the swap limits further down the
oom hierarchy, do you?
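
To make the point concrete: the per-level settings can be collected from userspace rather than from the oom report. The following is only a sketch, assuming a cgroup v2 mount where each cgroup directory exposes memory.min and memory.swap.max; the helper names read_setting and collect_settings are made up for illustration.

```python
import os
from pathlib import Path


def read_setting(cgroup_dir, name):
    """Return the stripped contents of a control file, or None if it is absent."""
    try:
        return (Path(cgroup_dir) / name).read_text().strip()
    except OSError:
        return None


def collect_settings(root):
    """Walk root and all descendant cgroups, returning
    (relative path, memory.min, memory.swap.max) tuples."""
    rows = []
    for dirpath, dirnames, _files in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        rows.append((
            os.path.relpath(dirpath, root),
            read_setting(dirpath, "memory.min"),
            read_setting(dirpath, "memory.swap.max"),
        ))
    return rows


if __name__ == "__main__":
    import sys
    # Pass a cgroup directory, e.g. one under the cgroup v2 mount point.
    for path, mem_min, swap_max in collect_settings(sys.argv[1]):
        print(f"{path}: memory.min={mem_min} memory.swap.max={swap_max}")
```

Values are returned as raw strings because memory.swap.max may read "max" rather than a number; a missing file (as for memory.min on the root cgroup) shows up as None.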

> > > Why not just print the memcgs which are under memory.min protection, or
> > > something like the total amount of min-protected memory?
> >
> > Yes, this would likely help. But the main question really remains: is
> > this really worth it?
> >
> 
> If it doesn't cost too much, I think it is worth doing.
> Since the oom path is not a critical path, adding some print info
> should not add much overhead.

Generating a lot of output for the oom reports has been a real problem
in many deployments.
[...]
> > > As I said in the commit log, we don't know why the file cache
> > > can't be reclaimed (when evictable is 0 and dirty is 0 as well).
> >
> > And the counter argument is that this will not help you there much in
> > many large and much more common cases.
> >
> > I argue, and I might be wrong here so feel free to correct me, that the
> > reclaim protection guarantee (min) is something that should be under the
> > admin's control. It shouldn't really happen willy-nilly because it has
> > really large consequences, OOM included. So if there is a suspicious
> > amount of memory that could be reclaimed normally then the reclaim
> > protection is really the first suspect to go after.
> > --
> 
> I don't know whether it happens willy-nilly or not.

It is a reclaim protection guarantee (so essentially an mlock-like
thing), so it had better be properly considered when used.

> But if we all know that it may cause OOMs and it doesn't take too much
> effort to show it in the OOM messages,

I do not think we are in agreement here. As mentioned above, the oom
report is quite heavy already. So it should be the other way around:
there should be a strong reason to add something more, such as a real
use case where not having that information makes debugging ooms
considerably harder.

-- 
Michal Hocko
SUSE Labs



