> On Tue, 8 Jun 2010, KOSAKI Motohiro wrote:
>
> > > @@ -267,6 +259,8 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
> > >                          continue;
> > >                  if (mem && !task_in_mem_cgroup(p, mem))
> > >                          continue;
> > > +                if (!has_intersects_mems_allowed(p))
> > > +                        continue;
> > >
> > >                  /*
> > >                   * This task already has access to memory reserves and is
> >
> > now we have three places of oom filtering
> >   (1) select_bad_process
>
> Done.
>
> >   (2) dump_tasks
>
> dump_tasks() has never filtered on this, it's possible for tasks is other
> cpusets to allocate memory on our nodes.

I have no objection, because that is a policy matter. But if so,
dump_tasks() should probably display the mems_allowed mask too. Otherwise
the end user can't understand why a task with a high badness score but no
intersecting mems_allowed wasn't killed.

> >   (3) oom_kill_task (when oom_kill_allocating_task==1 only)
> >
>
> Why would care about cpuset attachment in oom_kill_task()?  You mean
> oom_kill_process() to filter the children list?

Ah, interesting question. OK, we have to discuss the
oom_kill_allocating_task design first.

First of all, filtering the children list in oom_kill_process() and this
issue are independent and unrelated. My patch was not correct either.

Now, the basic oom_kill_allocating_task logic is below. That is, if
oom_kill_process() returns 0, the oom kill finished successfully, but if
oom_kill_process() returns 1, we fall back to the normal __out_of_memory()
path.

===================================================
static void __out_of_memory(gfp_t gfp_mask, int order, nodemask_t *nodemask)
{
        struct task_struct *p;
        unsigned long points;

        if (sysctl_oom_kill_allocating_task)
                if (!oom_kill_process(current, gfp_mask, order, 0, NULL, nodemask,
                                      "Out of memory (oom_kill_allocating_task)"))
                        return;
retry:
===================================================

When does oom_kill_process() return 1? I think it should be when

 - current is OOM_DISABLE
 - current has no intersecting cpuset
 - current is a kthread
 - etc etc..

That is, the rule should be consistent with the !oom_kill_allocating_task
case.
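As a rough illustration of that rule, here is a small userspace C model, not
kernel code. Everything prefixed with toy_ or TOY_ is hypothetical and exists
only for this sketch: the task struct is heavily simplified, memory nodes are
assumed to fit in a plain bitmask, and the badness score is assumed to be
precomputed. It only shows the shape of the idea: one shared "unkillable"
predicate used both for the allocating task and for the normal victim scan,
with a return value of 1 meaning "fall back".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sentinel meaning "never oom-kill this task". */
#define TOY_OOM_DISABLE (-17)

/* Hypothetical, heavily simplified stand-in for struct task_struct. */
struct toy_task {
    const char *comm;
    int oom_adj;                 /* TOY_OOM_DISABLE => never kill */
    bool is_kthread;             /* kernel threads are never killed */
    unsigned long mems_allowed;  /* assumed: bitmask of allowed memory nodes */
    unsigned long points;        /* assumed: precomputed badness score */
};

/* One predicate shared by both paths, so the rules stay consistent. */
static bool toy_oom_unkillable(const struct toy_task *p, unsigned long oom_nodes)
{
    if (p->oom_adj == TOY_OOM_DISABLE)
        return true;
    if (p->is_kthread)
        return true;
    if (!(p->mems_allowed & oom_nodes))  /* no intersection with the OOM nodes */
        return true;
    return false;
}

/* Returns 0 if p was "killed", 1 if the caller has to pick someone else. */
static int toy_oom_kill_process(const struct toy_task *p, unsigned long oom_nodes)
{
    if (toy_oom_unkillable(p, oom_nodes))
        return 1;
    printf("Kill process (%s) with score %lu\n", p->comm, p->points);
    return 0;
}

/* Normal path: highest-scoring task that passes the same predicate. */
static const struct toy_task *
toy_select_bad_process(const struct toy_task *tasks, size_t n,
                       unsigned long oom_nodes)
{
    const struct toy_task *chosen = NULL;

    for (size_t i = 0; i < n; i++) {
        if (toy_oom_unkillable(&tasks[i], oom_nodes))
            continue;
        if (!chosen || tasks[i].points > chosen->points)
            chosen = &tasks[i];
    }
    return chosen;
}

/* Try the allocating task first if asked to; otherwise, or on failure, scan. */
static const struct toy_task *
toy_out_of_memory(const struct toy_task *cur, const struct toy_task *tasks,
                  size_t n, unsigned long oom_nodes, bool kill_allocating_task)
{
    const struct toy_task *victim;

    assert(cur != NULL);

    if (kill_allocating_task && !toy_oom_kill_process(cur, oom_nodes))
        return cur;

    victim = toy_select_bad_process(tasks, n, oom_nodes);
    if (victim)
        toy_oom_kill_process(victim, oom_nodes);
    return victim;
}
```

With this shape, an allocating task that is OOM-disabled, a kthread, or
outside the OOM nodes simply makes the first call return 1, and the scan then
applies the same filter to everyone else.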
So, my previous patch didn't take care of the conflict with the
"oom: sacrifice child with highest badness score for parent" patch.
Probably the right way is

static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
                            unsigned long points, struct mem_cgroup *mem,
                            nodemask_t *nodemask, const char *message)
{
        struct task_struct *c;
        struct task_struct *t = p;
        struct task_struct *victim = p;
        unsigned long victim_points = 0;
        struct timespec uptime;

+       /*
+        * This process is not oom killable, so the caller needs to
+        * retry and select another bad process.
+        */
+       if (oom_unkillable(p, mem, nodemask))
+               return 1;

        if (printk_ratelimit())
                dump_header(p, gfp_mask, order, mem, nodemask);

        pr_err("%s: Kill process %d (%s) with score %lu or sacrifice child\n",
               message, task_pid_nr(p), p->comm, points);

or something else. What do you think?