On Tue, 9 Feb 2010, KAMEZAWA Hiroyuki wrote: > Index: mmotm-2.6.33-Feb06/include/linux/memcontrol.h > =================================================================== > --- mmotm-2.6.33-Feb06.orig/include/linux/memcontrol.h > +++ mmotm-2.6.33-Feb06/include/linux/memcontrol.h > @@ -71,7 +71,8 @@ extern unsigned long mem_cgroup_isolate_ > struct mem_cgroup *mem_cont, > int active, int file); > extern void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask); > -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem); > +int task_in_oom_mem_cgroup(struct task_struct *task, > + const struct mem_cgroup *mem); This is only called from the oom killer, so I'm not sure this needs to be renamed. It seems like any caller of this function, present or future, would be doing a tasklist iteration while holding a readlock on tasklist_lock, so perhaps just document that task_in_mem_cgroup() requires that? > > extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page); > extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p); > @@ -215,7 +216,7 @@ static inline int mm_match_cgroup(struct > return 1; > } > > -static inline int task_in_mem_cgroup(struct task_struct *task, > +static inline int task_in_oom_mem_cgroup(struct task_struct *task, > const struct mem_cgroup *mem) > { > return 1; > Index: mmotm-2.6.33-Feb06/mm/memcontrol.c > =================================================================== > --- mmotm-2.6.33-Feb06.orig/mm/memcontrol.c > +++ mmotm-2.6.33-Feb06/mm/memcontrol.c > @@ -781,16 +781,40 @@ void mem_cgroup_move_lists(struct page * > mem_cgroup_add_lru_list(page, to); > } > > -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem) > +/* > + * This function is called from OOM Killer. This checks the task is mm_owner > + * and checks it's mem_cgroup is under oom. > + */ > +int task_in_oom_mem_cgroup(struct task_struct *task, > + const struct mem_cgroup *mem) > { > + struct mm_struct *mm; > int ret; > struct mem_cgroup *curr = NULL; > > - task_lock(task); > + /* > + * The task's task->mm pointer is guarded by task_lock() but it's > + * risky to take task_lock in oom kill situaion. Oom-killer may > + * kill a task which is in unknown status and cause siginificant delay > + * or deadlock. > + * So, we use some loose way. Because we're under taslist lock, "task" > + * pointer is always safe and we can access it. So, accessing mem_cgroup > + * via task struct is safe. To check the task is mm owner, we do loose > + * check. And this is enough. > + * There is small race at updating mm->onwer but we can ignore it. > + * A problematic race here means that oom-selection logic by walking > + * task list itself is racy. We can't make any strict guarantee between > + * task's cgroup status and oom-killer selection, anyway. And, in real > + * world, this will be no problem. > + */ > + mm = task->mm; > + if (!mm || mm->owner != task) > + return 0; You can't dereference task->mm->owner without holding task_lock(task), but I don't see why you need to even deal with task->mm. All callers to this function will check for !task->mm either during their iterations or with oom_kill_task() returning 0. > rcu_read_lock(); > - curr = try_get_mem_cgroup_from_mm(task->mm); > + curr = mem_cgroup_from_task(task); > + if (!css_tryget(&curr->css)); > + curr = NULL; We can always dereference p because of tasklist_lock, there should be no need to do rcu_read_lock() or any rcu dereference, so you should be able to just do this: do { curr = mem_cgroup_from_task(task); if (!curr) break; } while (!css_tryget(&curr->css)); If you like that better, I suggest sending your original two-liner fix using task_in_mem_cgroup() while taking task_lock(p) to stable and then improving on it with a follow-up patch for mainline to do this refcount variation. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>