On Tue, Oct 30, 2012 at 11:14 PM, Luigi Semenzato <semenzato@xxxxxxxxxx> wrote:
> On Tue, Oct 30, 2012 at 9:46 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
>> On Tue, 30 Oct 2012, Luigi Semenzato wrote:
>>
>>> Actually, there is a very simple fix:
>>>
>>> @@ -355,14 +364,6 @@ static struct task_struct
>>> *select_bad_process(unsigned int *ppoints,
>>>                         if (p == current) {
>>>                                 chosen = p;
>>>                                 *ppoints = 1000;
>>> -                       } else if (!force_kill) {
>>> -                               /*
>>> -                                * If this task is not being ptraced on exit,
>>> -                                * then wait for it to finish before killing
>>> -                                * some other task unnecessarily.
>>> -                                */
>>> -                               if (!(p->group_leader->ptrace & PT_TRACE_EXIT))
>>> -                                       return ERR_PTR(-1UL);
>>>                         }
>>>                 }
>>>
>>> I'd rather kill some other task unnecessarily than hang!  My load
>>> works fine with this change.
>>>
>>
>> That's not an acceptable "fix" at all, it will lead to unnecessarily
>> killing processes when others are in the exit path, i.e. every oom kill
>> would kill two or three or more processes instead of just one.
>
> I am sorry, I didn't mean to suggest that this is the right fix for
> everybody.  It seems to work for us.  A real fix would be much harder,
> I think.  Certainly it would be for me.
>
> We don't rely on OOM-killing for memory management (we tried to, but
> it has drawbacks).  But OOM kills can still happen, so we have to deal
> with them.  We can deal with multiple processes being killed, but not
> with a hang.  I might be tempted to say that this should be true for
> everybody, but I can imagine systems that work by allowing only one
> process to die, and perhaps the load on those systems is such that
> they don't experience this deadlock often, or ever (even though I
> would be nervous about it).

To make it clear, I am suggesting that this "fix" might work as a
temporary workaround until a better fix is available.

>> Could you please try this on 3.6 since all the code you're quoting is from
>> old kernels?
>
> I will see if I can do it, but we're shipping 3.4 and I am not sure
> about the status of our 3.6 tree.  I will also visually inspect the
> relevant 3.6 code and see if the possibility of deadlock is still
> there.