On Mon, 8 Feb 2010 23:50:12 -0800 (PST) David Rientjes <rientjes@xxxxxxxxxx> wrote: > On Tue, 9 Feb 2010, KAMEZAWA Hiroyuki wrote: > > > Index: mmotm-2.6.33-Feb06/include/linux/memcontrol.h > > =================================================================== > > --- mmotm-2.6.33-Feb06.orig/include/linux/memcontrol.h > > +++ mmotm-2.6.33-Feb06/include/linux/memcontrol.h > > @@ -71,7 +71,8 @@ extern unsigned long mem_cgroup_isolate_ > > struct mem_cgroup *mem_cont, > > int active, int file); > > extern void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask); > > -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem); > > +int task_in_oom_mem_cgroup(struct task_struct *task, > > + const struct mem_cgroup *mem); > > This is only called from the oom killer, so I'm not sure this needs to > be renamed. Why I renamed this is "be careful when a new user calls this". > It seems like any caller of this function, present or future, > would be doing a tasklist iteration while holding a readlock on > tasklist_lock, so perhaps just document that task_in_mem_cgroup() requires > that? Hmm. ok. I avoid this rename. It will make the patch smaller. > > > > > extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page); > > extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p); > > @@ -215,7 +216,7 @@ static inline int mm_match_cgroup(struct > > return 1; > > } > > > > -static inline int task_in_mem_cgroup(struct task_struct *task, > > +static inline int task_in_oom_mem_cgroup(struct task_struct *task, > > const struct mem_cgroup *mem) > > { > > return 1; > > Index: mmotm-2.6.33-Feb06/mm/memcontrol.c > > =================================================================== > > --- mmotm-2.6.33-Feb06.orig/mm/memcontrol.c > > +++ mmotm-2.6.33-Feb06/mm/memcontrol.c > > @@ -781,16 +781,40 @@ void mem_cgroup_move_lists(struct page * > > mem_cgroup_add_lru_list(page, to); > > } > > > > -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem) > > +/* > > + * This function is called from OOM Killer. This checks the task is mm_owner > > + * and checks it's mem_cgroup is under oom. > > + */ > > +int task_in_oom_mem_cgroup(struct task_struct *task, > > + const struct mem_cgroup *mem) > > { > > + struct mm_struct *mm; > > int ret; > > struct mem_cgroup *curr = NULL; > > > > - task_lock(task); > > + /* > > + * The task's task->mm pointer is guarded by task_lock() but it's > > + * risky to take task_lock in oom kill situaion. Oom-killer may > > + * kill a task which is in unknown status and cause siginificant delay > > + * or deadlock. > > + * So, we use some loose way. Because we're under taslist lock, "task" > > + * pointer is always safe and we can access it. So, accessing mem_cgroup > > + * via task struct is safe. To check the task is mm owner, we do loose > > + * check. And this is enough. > > + * There is small race at updating mm->onwer but we can ignore it. > > + * A problematic race here means that oom-selection logic by walking > > + * task list itself is racy. We can't make any strict guarantee between > > + * task's cgroup status and oom-killer selection, anyway. And, in real > > + * world, this will be no problem. > > + */ > > + mm = task->mm; > > + if (!mm || mm->owner != task) > > + return 0; > > You can't dereference task->mm->owner without holding task_lock(task), but > I don't see why you need to even deal with task->mm. All callers to this > function will check for !task->mm either during their iterations or with > oom_kill_task() returning 0. > Just for being careful. We don't hold task_lock(), which guards task->mm in callers. > > rcu_read_lock(); > > - curr = try_get_mem_cgroup_from_mm(task->mm); > > + curr = mem_cgroup_from_task(task); > > + if (!css_tryget(&curr->css)); > > + curr = NULL; > > We can always dereference p because of tasklist_lock, there should be no > need to do rcu_read_lock() or any rcu dereference, so you should be able > to just do this: > > do { > curr = mem_cgroup_from_task(task); > if (!curr) > break; > } while (!css_tryget(&curr->css)); > Ok, I missed that. thank you. I'll use this code. > If you like that better, I suggest sending your original two-liner fix > using task_in_mem_cgroup() while taking task_lock(p) to stable and then > improving on it with a follow-up patch for mainline to do this refcount > variation. > Hmm. ok. I'll devide the patch into 2 parts. Thank you for review. Regards, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>