On Tue, 8 Mar 2011 15:49:10 -0800 (PST)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> On Tue, 8 Mar 2011, KAMEZAWA Hiroyuki wrote:
>
> > > That's aside from the general purpose of the new
> > > memory.oom_delay_millisecs: users may want a grace period for userspace to
> > > increase the hard limit or kill a task before deferring to the kernel.
> > > That seems exponentially more useful than simply disabling the oom killer
> > > entirely with memory.oom_control.  I think it's unfortunate
> > > memory.oom_control was merged first and seems to have tainted this entire
> > > discussion.
> > >
> >
> > That sounds like a mis-usage problem... what kind of workaround is offered
> > if the user doesn't configure oom_delay_millisecs? Yet another mis-usage?
> >
>
> Not exactly sure what you mean, but you're saying disabling the oom killer
> with memory.oom_control is not the recommended way to allow userspace to
> fix the issue itself?  That seems like it's the entire usecase: we'd
> rarely want to let a memcg stall when it needs memory without trying to
> address the problem (elevating the limit, killing a lower priority job,
> sending a signal to free memory).  We have a memcg oom notifier to handle
> the situation but there's no guarantee that the kernel won't kill
> something first and that's a bad result if we chose to address it with one
> of the ways mentioned above.
>

Why would memcg's oom and the system's oom happen at the same time?

> To answer your question: if the admin doesn't configure a
> memory.oom_delay_millisecs, then the oom killer will obviously kill
> something off (if memory.oom_control is also not set) when reclaim fails
> to free memory just as before.
>
> Aside from my specific usecase for this tunable, let me pose a question:
> do you believe that the memory controller would benefit from allowing
> users to have a grace period in which to take one of the actions listed
> above instead of killing something itself?
> Yes, this would be possible by
> setting and then unsetting memory.oom_control, but that requires userspace
> to always be responsive (which, at our scale, we can unequivocally say
> isn't always possible) and doesn't effectively deal with spikes in memory
> that may only be temporary and don't require any intervention of the
> user at all.
>

Please add a 'notifier' in kernel space and handle the event in a kernel module.
It is much better than 'timeout and allow oom-kill again'.

If you add a notifier_chain in memcg's oom path, I have no objection.
Implementing a custom oom handler for it in a kernel module sounds better
than a timeout. If necessary, please export some functionality of memcg.

IIUC, the system's oom-killer has a notifier chain for oom-kill. There is no
reason it would be bad to have one for memcg.

Isn't that ok? I think you can do what you want with it.

Thanks,
-Kame
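
For reference, here is a minimal userspace sketch of the "setting and then unsetting memory.oom_control" approach mentioned in the quoted text. It assumes a cgroup v1 memory controller whose group directory contains memory.oom_control, where writing 1 disables the oom killer for the group and writing 0 re-enables it. The function names, the try_fix callback and the polling loop are illustrative assumptions, not part of any kernel or library API.

```python
import os
import time


def set_oom_kill_disable(cgroup_dir, disabled):
    """Write 1 (oom killer disabled) or 0 (enabled) to memory.oom_control."""
    with open(os.path.join(cgroup_dir, "memory.oom_control"), "w") as f:
        f.write("1" if disabled else "0")


def grace_period(cgroup_dir, try_fix, delay_ms, poll_ms=10):
    """Give userspace up to delay_ms to resolve an oom condition.

    try_fix is a hypothetical callback that attempts one remedy (raise the
    limit, kill a lower priority job, ...) and returns True on success.
    The kernel oom killer is re-enabled on the way out in every case.
    """
    set_oom_kill_disable(cgroup_dir, True)
    deadline = time.monotonic() + delay_ms / 1000.0
    try:
        while time.monotonic() < deadline:
            if try_fix():
                return True   # userspace handled it within the window
            time.sleep(poll_ms / 1000.0)
        return False          # window expired; defer to the kernel
    finally:
        set_oom_kill_disable(cgroup_dir, False)
```

Note that the sketch has the weakness raised in the thread: if the process running grace_period() stalls or dies before the final write, nothing re-enables the oom killer for the group. That is the gap a kernel-side timeout or an in-kernel notifier is meant to close.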