On Wed, 23 Feb 2011 15:08:50 -0800
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Wed, 9 Feb 2011 14:19:50 -0800 (PST)
> David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> > Completely disabling the oom killer for a memcg is problematic if
> > userspace is unable to address the condition itself, usually because it
> > is unresponsive.  This scenario creates a memcg deadlock: tasks are
> > sitting in TASK_KILLABLE waiting for the limit to be increased, a task to
> > exit or move, or the oom killer reenabled and userspace is unable to do
> > so.
> >
> > An additional possible use case is to defer oom killing within a memcg
> > for a set period of time, probably to prevent unnecessary kills due to
> > temporary memory spikes, before allowing the kernel to handle the
> > condition.
> >
> > This patch adds an oom killer delay so that a memcg may be configured to
> > wait at least a pre-defined number of milliseconds before calling the oom
> > killer.  If the oom condition persists for this number of milliseconds,
> > the oom killer will be called the next time the memory controller
> > attempts to charge a page (and memory.oom_control is set to 0).  This
> > allows userspace to have a short period of time to respond to the
> > condition before deferring to the kernel to kill a task.
> >
> > Admins may set the oom killer delay using the new interface:
> >
> > 	# echo 60000 > memory.oom_delay_millisecs
> >
> > This will defer oom killing to the kernel only after 60 seconds has
> > elapsed by putting the task to sleep for 60 seconds.  When setting
> > memory.oom_delay_millisecs, all pending delays have their charges retried
> > and, if necessary, the new delay is then enforced.
> >
> > The delay is cleared the first time the memcg is oom to avoid unnecessary
> > waiting when userspace is unresponsive for future oom conditions.  It may
> > be set again using the above interface to enforce a delay on the next
> > oom.
> >
> > When a memory.oom_delay_millisecs is set for a cgroup, it is propagated
> > to all children memcg as well and is inherited when a new memcg is
> > created.
>
> Your patch still stinks!
>
> If userspace can't handle a disabled oom-killer then userspace
> shouldn't have disabled the oom-killer.
>
> How do we fix this properly?
>
> A little birdie tells me that the offending userspace oom handler is
> running in a separate memcg and is not itself running out of memory.
> The problem is that the userspace oom handler is also taking peeks into
> processes which are in the stressed memcg and is getting stuck on
> mmap_sem in the procfs reads.  Correct?
>

Hmm, I think memcg's oom-kill just happens under down_read(mmap_sem).
And all tasks that are under oom will be in the wait queue.

Thanks,
-Kame
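
P.S. For anyone following the thread, the delay semantics in the quoted patch description can be modeled roughly as below. This is only a toy userspace sketch in Python, not the kernel implementation: the class name, method names, and return values are hypothetical, time is passed in explicitly, and propagation to child memcgs is left out.

```python
class MemcgDelayModel:
    """Toy model of the quoted oom-delay behaviour (hypothetical names)."""

    def __init__(self):
        self.delay_ms = 0        # 0 means no delay is configured
        self.deadline_ms = None  # set while a delay is being enforced

    def set_delay(self, delay_ms):
        # Models a write to memory.oom_delay_millisecs. Simplification:
        # any pending wait is dropped and restarts on the next failed charge.
        self.delay_ms = delay_ms
        self.deadline_ms = None

    def charge_failed(self, now_ms):
        """Return 'wait' while the delay runs, 'kill' once it has expired."""
        if self.deadline_ms is None:
            if self.delay_ms == 0:
                return "kill"
            # First oom: start the timer and clear the configured delay so
            # that later oom conditions are not delayed again.
            self.deadline_ms = now_ms + self.delay_ms
            self.delay_ms = 0
            return "wait"
        if now_ms < self.deadline_ms:
            return "wait"
        self.deadline_ms = None
        return "kill"


model = MemcgDelayModel()
model.set_delay(60000)
print(model.charge_failed(0))      # delay starts
print(model.charge_failed(60000))  # delay has expired
```

In this model, a later oom after the first one is handled immediately unless set_delay() is called again, which mirrors the "delay is cleared the first time the memcg is oom" rule.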