On Wed, 23 Feb 2011 16:51:24 -0800 (PST)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> > The problem is that the userspace oom handler is also taking peeks into
> > processes which are in the stressed memcg and is getting stuck on
> > mmap_sem in the procfs reads.  Correct?
> >
>
> That's outside the scope of this feature and is a separate discussion;
> this patch specifically addresses an issue where a userspace job scheduler
> wants to take action when a memcg is oom before deferring to the kernel
> and happens to become unresponsive for whatever reason.

That's just handwaving used to justify a workaround for a kernel
deficiency.

If userspace has chosen to replace the oom-killer then userspace should
be able to appropriately perform the role.  But for some
as-yet-undescribed reason, userspace is *not* able to perform that
role.

And I'm suspecting that the as-yet-undescribed reason is a kernel
deficiency.  Spit it out.

> > It seems to me that such a userspace oom handler is correctly designed,
> > and that we should be looking into the reasons why it is unreliable,
> > and fixing them.  Please tell us about this?
> >
>
> The problem isn't specific to any one cause or implementation, we know
> that userspace programs have bugs, they can stall forever in D-state, they
> can be oom themselves, they get stuck waiting on a lock, etc etc.

It's not the kernel's role to work around userspace bugs and it's
certainly not the kernel's role to work around kernel bugs.

Now please tell us: why is the userspace job manager getting stuck?