On Fri, Jan 18, 2019 at 7:35 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2019/01/19 9:50, Shakeel Butt wrote:
> > On looking further it seems like the process selected to be oom-killed
> > has exited even before reaching read_lock(&tasklist_lock) in
> > oom_kill_process(). More specifically, tsk->usage is 1 (due to the
> > get_task_struct() in oom_evaluate_task()), so the put_task_struct()
> > within for_each_thread() frees the tsk while for_each_thread() is still
> > trying to access it. The easiest fix is to do get/put across the
> > for_each_thread() on the selected task.
>
> Good catch. p->usage can become 1 while printk()ing a lot at dump_header().
>
> > @@ -981,6 +981,13 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
> >  	 * still freeing memory.
> >  	 */
> >  	read_lock(&tasklist_lock);
> > +
> > +	/*
> > +	 * The task 'p' might have already exited before reaching here. The
> > +	 * put_task_struct() will free task_struct 'p' while the loop still
> > +	 * tries to access the fields of 'p', so get an extra reference.
> > +	 */
> > +	get_task_struct(p);
> >  	for_each_thread(p, t) {
> >  		list_for_each_entry(child, &t->children, sibling) {
> >  			unsigned int child_points;
> > @@ -1000,6 +1007,7 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
> >  			}
> >  		}
> >  	}
> > +	put_task_struct(p);
>
> Moving put_task_struct(p) to after read_unlock(&tasklist_lock) will reduce
> the latency seen by a write_lock(&tasklist_lock) waiter.
>
> >  	read_unlock(&tasklist_lock);
> >
> >  	/*
>
> By the way, p->usage already being 1 implies that p->mm == NULL, because
> exit_mm(p) has already completed. Then process_shares_mm(child, p->mm)
> might fail to return true for some of the children. Not critical, but it
> might lead to unnecessary oom_badness() calls during child selection.
> Maybe we want to use the same logic __oom_kill_process() uses (i.e. bail
> out if find_lock_task_mm(p) fails)?

Thanks for the review. I am thinking of removing the whole children
selection heuristic for now.

Shakeel
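
For reference, with both suggestions folded in (the extra reference from the
quoted hunk, plus dropping it only after read_unlock() as Tetsuo suggests),
the region would look roughly like the sketch below. This is an illustration
against the quoted hunk, not the patch as posted; the loop body that the
diff elides is summarized in a comment.

	/*
	 * Sketch (not the posted patch) of oom_kill_process()'s
	 * child-selection loop with both suggestions applied.
	 */
	read_lock(&tasklist_lock);

	/*
	 * The selected task 'p' may have exited already, leaving
	 * p->usage == 1 (only the reference taken in oom_evaluate_task()).
	 * Without an extra reference here, the put_task_struct() issued in
	 * the loop body below could drop the last reference and free 'p'
	 * while for_each_thread() is still walking its threads.
	 */
	get_task_struct(p);
	for_each_thread(p, t) {
		list_for_each_entry(child, &t->children, sibling) {
			unsigned int child_points;

			/* ... pick the highest-oom_badness() child; elided in the quoted diff ... */
		}
	}
	read_unlock(&tasklist_lock);

	/*
	 * Per Tetsuo: drop the extra reference only after releasing
	 * tasklist_lock, so a potentially-final (and therefore more
	 * expensive) put does not lengthen the read-side critical section
	 * for write_lock(&tasklist_lock) waiters.
	 */
	put_task_struct(p);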
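
Tetsuo's closing suggestion could be sketched as the fragment below, placed
between get_task_struct(p) and the thread walk. The local 'mm_task' and the
exact placement are assumptions for illustration, not code from the thread;
Shakeel's reply sidesteps the question by proposing to remove the
child-selection heuristic altogether.

	/*
	 * Illustrative only: skip the child walk entirely when no thread of
	 * 'p' still owns an mm, mirroring the find_lock_task_mm() bail-out
	 * in __oom_kill_process(). With p->mm == NULL (exit_mm(p) done),
	 * process_shares_mm(child, p->mm) can never return true, so even
	 * mm-sharing children would be run through oom_badness() for nothing.
	 */
	mm_task = find_lock_task_mm(p);		/* 'mm_task': hypothetical local */
	if (mm_task) {
		task_unlock(mm_task);		/* find_lock_task_mm() returns it locked */
		for_each_thread(p, t) {
			/* ... child selection as in the sketch above ... */
		}
	}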