On Tue, 9 Feb 2010, KAMEZAWA Hiroyuki wrote: > > > > > - task_lock(task); > > > > > + /* > > > > > + * The task's task->mm pointer is guarded by task_lock() but it's > > > > > + * risky to take task_lock in oom kill situaion. Oom-killer may > > > > > + * kill a task which is in unknown status and cause siginificant delay > > > > > + * or deadlock. > > > > > + * So, we use some loose way. Because we're under taslist lock, "task" > > > > > + * pointer is always safe and we can access it. So, accessing mem_cgroup > > > > > + * via task struct is safe. To check the task is mm owner, we do loose > > > > > + * check. And this is enough. > > > > > + * There is small race at updating mm->onwer but we can ignore it. > > > > > + * A problematic race here means that oom-selection logic by walking > > > > > + * task list itself is racy. We can't make any strict guarantee between > > > > > + * task's cgroup status and oom-killer selection, anyway. And, in real > > > > > + * world, this will be no problem. > > > > > + */ > > > > > + mm = task->mm; > > > > > + if (!mm || mm->owner != task) > > > > > + return 0; > > > > > > > > You can't dereference task->mm->owner without holding task_lock(task), but > > > > I don't see why you need to even deal with task->mm. All callers to this > > > > function will check for !task->mm either during their iterations or with > > > > oom_kill_task() returning 0. > > > > > > > Just for being careful. We don't hold task_lock(), which guards task->mm in > > > callers. > > > > > > > The callers don't care if it disappears out from under us since we never > > dereference it, it's just a sanity check to ensure we don't pick a > > kthread or an exiting task that won't free any memory. > > But we need the guarantee that it's safe to access mm->owner in this code. > It's possible task->mm is set to be NULL while we come here. > Hmm. taking task_lock() is better, finally ? > That was my original point when I said you can't dereference task->mm->owner without task_lock(task), but I don't see why you need that check to begin with. > But I don't like taking such a lock here to do easy checks.. > *maybe* I'll postpone this updates and just post original fix again. > Feel free to add my Acked-by: David Rientjes <rientjes@xxxxxxxxxx> since it's a much-needed fix for memcg both in mainline and in -stable. > There are task_lock() and task_unlock() but task_trylock() is not implemented. > I think I shouldn't add a new trylock. task_trylock() isn't appropriate for this usecase because it would exclude tasks from the iteration in select_bad_process() if its contended, i.e. we could panic the machine unnecessary simply because the lock is taken. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>