On Fri, Jan 18, 2013 at 11:04:10AM +0800, Xiao Guangrong wrote:
> On 01/18/2013 10:48 AM, Robin Holt wrote:
> > On Fri, Jan 18, 2013 at 10:42:07AM +0800, Xiao Guangrong wrote:
> >> On 01/17/2013 09:45 PM, Robin Holt wrote:
> >>> On Thu, Jan 17, 2013 at 08:19:55PM +0800, Xiao Guangrong wrote:
> >>>> On 01/17/2013 07:12 PM, Robin Holt wrote:
> >>>>> On Thu, Jan 17, 2013 at 10:45:32AM +0800, Xiao Guangrong wrote:
> >>>>>> On 01/17/2013 05:01 AM, Robin Holt wrote:
> >>>>>>>
> >>>>>>> There is a race condition between mmu_notifier_unregister() and
> >>>>>>> __mmu_notifier_release().
> >>>>>>>
> >>>>>>> Assume two tasks, one calling mmu_notifier_unregister() as a result
> >>>>>>> of a filp_close() ->flush() callout (task A), and the other calling
> >>>>>>> mmu_notifier_release() from an mmput() (task B).
> >>>>>>>
> >>>>>>>         A                       B
> >>>>>>> t1                              srcu_read_lock()
> >>>>>>> t2      if (!hlist_unhashed())
> >>>>>>> t3                              srcu_read_unlock()
> >>>>>>> t4      srcu_read_lock()
> >>>>>>> t5                              hlist_del_init_rcu()
> >>>>>>> t6                              synchronize_srcu()
> >>>>>>> t7      srcu_read_unlock()
> >>>>>>> t8      hlist_del_rcu()  <--- NULL pointer deref.
> >>>>>>
> >>>>>> The detailed code here is:
> >>>>>>     hlist_del_rcu(&mn->hlist);
> >>>>>>
> >>>>>> Can mn be NULL? I do not think so, since mn is always the embedded
> >>>>>> struct of the caller; it is freed after calling
> >>>>>> mmu_notifier_unregister.
> >>>>>
> >>>>> If you look at __mmu_notifier_release() it is using
> >>>>> hlist_del_init_rcu() which will set the hlist->pprev to NULL.  When
> >>>>> hlist_del_rcu() is called, it attempts to update
> >>>>> *hlist->pprev = hlist->next and that is where it takes the NULL
> >>>>> pointer deref.
> >>>>
> >>>> Yes, sorry for my carelessness. So, can that not be fixed by using
> >>>> hlist_del_init_rcu instead?
> >>>
> >>> The problem is the race described above.  Thread 'A' has checked to
> >>> see if n->pprev != NULL.  Based upon that, it called the
> >>> mn->release() method.
> >>> While it was trying to call the release method, thread 'B' ended
> >>> up calling hlist_del_init_rcu() which set n->pprev = NULL.  Then
> >>> thread 'A' got to run again and now it tries to do the
> >>> hlist_del_rcu() which, as part of __hlist_del(), sets pprev to
> >>> n->pprev (which is NULL), and then *pprev = n->next; hits the NULL
> >>> pointer deref.
> >>
> >> I mean using hlist_del_init_rcu instead of hlist_del_rcu in
> >> mmu_notifier_unregister(); hlist_del_init_rcu is aware of ->pprev.
> >
> > How does that address the calling of the ->release() method twice?
>
> Hmm, what is the problem with it? If it is just a "performance issue", I
> think it is not worth introducing such a complex lock rule just for the
> really rare case.

Complex lock rule?  We merely moved the lock up earlier in the code path.
Without this, there are some cases where your notifier's ->release() gets
called twice, while in the majority of cases it gets called once, and on
top of that it hits a NULL pointer deref.  What is so complex about that?

I originally was going to change both the __mmu_notifier_release()
function and the mmu_notifier_unregister() function to make the sequence
lock, unlink, unlock, callout, but I thought that, although more correct,
it would get pushback despite the fact that the lock is structure local
and likely to be contended by only two threads at the same time.

Thanks,
Robin
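
For anyone following along, here is a rough userspace sketch of the
"lock, unlink, unlock, callout" sequence mentioned above.  It is not the
patch under discussion and not kernel code: struct hnode, struct notifier,
teardown() and the atomic_flag "spinlock" are all made-up stand-ins, and
the two teardown paths are replayed one after the other rather than on
real threads.  It only illustrates that an hlist_del_init_rcu()-style
unlink leaves pprev NULL (so a later plain "*pprev = next" would write
through NULL), and that deciding under the lock who unlinked the node
gives exactly one ->release() callout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model of the kernel's hlist_node; all names are illustrative. */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;      /* address of the pointer that points at us */
};

struct hhead {
    struct hnode *first;
};

struct notifier {              /* stand-in for struct mmu_notifier */
    struct hnode hlist;
    int release_calls;         /* counts simulated ->release() callouts */
};

/* Stand-in for the structure-local spinlock. */
static atomic_flag list_lock = ATOMIC_FLAG_INIT;

static void list_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&list_lock)) {
        /* spin until the flag is clear */
    }
}

static void list_lock_release(void)
{
    atomic_flag_clear(&list_lock);
}

static void hnode_add_head(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

static int hnode_unhashed(const struct hnode *n)
{
    return n->pprev == NULL;
}

/* Like hlist_del_init_rcu(): unlink if still hashed, then pprev = NULL. */
static void hnode_del_init(struct hnode *n)
{
    if (!hnode_unhashed(n)) {
        *n->pprev = n->next;
        if (n->next)
            n->next->pprev = n->pprev;
        n->pprev = NULL;
    }
}

/* Would a plain __hlist_del()-style "*pprev = next" write through NULL? */
static int plain_del_would_fault(const struct hnode *n)
{
    return n->pprev == NULL;
}

/* lock, unlink, unlock, callout: only the caller that unlinked calls out. */
static void teardown(struct notifier *mn)
{
    int won;

    list_lock_acquire();
    won = !hnode_unhashed(&mn->hlist);
    hnode_del_init(&mn->hlist);
    list_lock_release();

    if (won)
        mn->release_calls++;   /* stands in for the ->release() callout */
}

/* Replays both teardown paths back to back; returns the callout count. */
static int run_both_paths(void)
{
    struct hhead head = { NULL };
    struct notifier mn = { { NULL, NULL }, 0 };

    hnode_add_head(&mn.hlist, &head);
    teardown(&mn);                             /* first path unlinks */
    assert(plain_del_would_fault(&mn.hlist));  /* pprev is NULL now */
    teardown(&mn);                             /* second path is a no-op */
    return mn.release_calls;
}
```

Calling run_both_paths() from a small main() should report a single
callout.  Whether holding the lock across the unlink in both functions is
acceptable in the real code is exactly the question raised above; the
sketch takes no position on that.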