On 2019/3/17 3:42, Andrea Arcangeli wrote: > On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote: >> On 2019/3/16 5:39, Andrea Arcangeli wrote: >>> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote: >>>> I can reproduce the issue in arm64 qemu machine. The issue will leave after applying the >>>> patch. >>>> >>>> Tested-by: zhong jiang <zhongjiang@xxxxxxxxxx> >>> Thanks a lot for the quick testing! >>> >>>> Meanwhile, I just has a little doubt whether it is necessary to use RCU to free the task struct or not. >>>> I think that mm->owner alway be NULL after failing to create to process. Because we call mm_clear_owner. >>> I wish it was enough, but the problem is that the other CPU may be in >>> the middle of get_mem_cgroup_from_mm() while this runs, and it would >>> dereference mm->owner while it is been freed without the call_rcu >>> affter we clear mm->owner. What prevents this race is the >> As you had said, It would dereference mm->owner after we clear mm->owner. >> >> But after we clear mm->owner, mm->owner should be NULL. Is it right? >> >> And mem_cgroup_from_task will check the parameter. >> you mean that it is possible after checking the parameter to clear the owner . >> and the NULL pointer will trigger. :-( > Dereference mm->owner didn't mean reading the value of the mm->owner > pointer, it really means to dereference the value of the pointer. It's > like below: > > get_mem_cgroup_from_mm() failing fork() > ---- --- > task = mm->owner > mm->owner = NULL; > free(mm->owner) > *task /* use after free */ > > We didn't set mm->owner to NULL before, so the window for the race was > larger, but setting mm->owner to NULL only hides the problem and it > can still happen (albeit with a smaller window). > > If get_mem_cgroup_from_mm() can see at any time mm->owner not NULL, > then the free of the task struct must be delayed until after > rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is > the standard RCU model, the freeing must be delayed until after the > next quiescent point. Thank you for your explaination patiently. The patch should go to upstream too. I think you should send a formal patch to the mainline. Maybe other people suffer from the issue. :-) Thanks, zhong jiang > BTW, both mm_update_next_owner() and mm_clear_owner() should have used > WRITE_ONCE when they write to mm->owner, I can update that too but > it's just to not to make assumptions that gcc does the right thing > (and we still rely on gcc to do the right thing in other places) so > that is just an orthogonal cleanup. > > Thanks, > Andrea > > . >