Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 18, 2013 at 05:51:10PM +0100, Michal Hocko wrote:
> On Mon 18-11-13 10:41:15, Johannes Weiner wrote:
> > On Thu, Nov 14, 2013 at 03:26:51PM -0800, David Rientjes wrote:
> > > When current has a pending SIGKILL or is already in the exit path, it
> > > only needs access to memory reserves to fully exit.  In that sense, the
> > > memcg is not actually oom for current, it simply needs to bypass memory
> > > charges to exit and free its memory, which is guarantee itself that
> > > memory will be freed.
> > > 
> > > We only want to notify userspace for actionable oom conditions where
> > > something needs to be done (and all oom handling can already be deferred
> > > to userspace through this method by disabling the memcg oom killer with
> > > memory.oom_control), not simply when a memcg has reached its limit, which
> > > would actually have to happen before memcg reclaim actually frees memory
> > > for charges.
> > 
> > Even though the situation may not require a kill, the user still wants
> > to know that the memory hard limit was breached and the isolation
> > broken in order to prevent a kill.  We just came really close and the
> 
> You can observe that you are getting into troubles from fail counter
> already. The usability without more reclaim statistics is a bit
> questionable but you get a rough impression that something is wrong at
> least.
> 
> > fact that current is exiting is coincidental.  Not everybody is having
> > OOM situations on a frequent basis and they might want to know when
> > they are redlining the system and that the same workload might blow up
> > the next time it's run.
> 
> I am just concerned that signaling temporal OOM conditions which do not
> require any OOM killer action (user or kernel space) might be confusing.
> Userspace would have harder times to tell whether any action is required
> or not.

But userspace in all likeliness DOES need to take action.

Reclaim is a really long process.  If 5 times doing 12 priority cycles
and scanning thousands of pages is not enough to reclaim a single
page, what does that say about the health of the memcg?

But more importantly, OOM handling is just inherently racy.  A task
might receive the kill signal a split second *after* userspace was
notified.  Or a task may exit voluntarily a split second after a
victim was chosen and killed.

We have to draw a line somewhere, right now this is "reclaim failed".
This patch doesn't fix a problem, it just blurs that line and makes
OOM notifications less predictable.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]