On Tue, 21 Dec 2010, Andrew Morton wrote:

> > Completely disabling the oom killer for a memcg is problematic if
> > userspace is unable to address the condition itself, usually because
> > userspace is unresponsive.  This scenario creates a memcg livelock:
> > tasks are continuously trying to allocate memory and nothing is getting
> > killed, so memory freeing is impossible since reclaim has failed, and
> > all work stalls with no remedy in sight.
>
> Userspace was buggy, surely.  If userspace has elected to disable the
> oom-killer then it should ensure that it can cope with the ensuing result.
>

I think it would be argued that no such guarantee can ever be made.

> One approach might be to run a mlockall()ed watchdog which monitors the
> worker tasks via shared memory.  Another approach would be to run that
> watchdog in a different memcg, without mlockall().  There are surely
> plenty of other ways of doing it.
>

Yeah, we considered a simple and perfect userspace implementation that
would be fault tolerant unless it ends up getting killed (not by the oom
killer) or dies itself, but there was a concern that setting every memcg
to have oom_control of 1, which disables the kernel oom killer, could
render the entire kernel useless without the help of userspace, and that
is a bad policy.

In our particular use case, we _always_ want to defer using the kernel
oom killer unless userspace chooses not to act (because the limit is
already high enough) or cannot act (because of a bug).  The former is
accomplished by setting memory.oom_control to 1 originally and then
setting it to 0 for that particular memcg to allow the oom kill, but
nothing comparable is possible for the latter.

> Minutiae:
>
> - changelog and docs forgot to mention that oom_delay=0 disables.
>

I thought it would be intuitive that an oom_delay of 0 would mean there
was no delay :)

> - it's called oom_kill_delay in the kernel and oom_delay in userspace.
>

Right, this was because of the symmetry to the oom_kill_disable naming in
the struct itself.
I'd be happy to change it if we're to go ahead in this direction.

> - oom_delay_millisecs would be a better name for the pseudo file.
>

Agreed.
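
For reference, the watchdog approach suggested in the quoted text (a
separate task that monitors the workers and reacts when they stop
responding) could look roughly like the sketch below.  This is only an
illustration under stated assumptions: the heartbeat dictionary stands in
for the shared memory region, and the names stale_workers, watchdog_pass,
timeout_s and on_stale are hypothetical, not part of any patch or kernel
interface.  What to do about a stale worker (for example, handing the
memcg back to the kernel oom killer) is left to the callback.

```python
def stale_workers(heartbeats, now, timeout_s):
    """Return the sorted names of workers whose last heartbeat is too old.

    heartbeats maps a worker name to the timestamp (in seconds) of its
    most recent heartbeat.  It is a stand-in for the shared memory region
    mentioned in the thread (an assumption made for this sketch).
    """
    return sorted(
        name for name, last in heartbeats.items() if now - last > timeout_s
    )


def watchdog_pass(heartbeats, now, timeout_s, on_stale):
    """Run one polling pass and call on_stale(name) for each stale worker.

    The reaction is supplied by the caller, so this sketch does not
    prescribe a policy.  Returns the list of stale worker names.
    """
    stale = stale_workers(heartbeats, now, timeout_s)
    for name in stale:
        on_stale(name)
    return stale


if __name__ == "__main__":
    # worker-b last reported 10 seconds ago, which exceeds the 5 second
    # timeout, so only worker-b is passed to the callback.
    beats = {"worker-a": 100.0, "worker-b": 91.0}
    watchdog_pass(beats, now=101.0, timeout_s=5.0,
                  on_stale=lambda name: print("stale:", name))
```

As noted above, such a watchdog only helps while it stays alive and
responsive itself, which is the gap the kernel-side delay is meant to
cover.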