On 01/18/2013 08:14 PM, Robin Holt wrote:
> On Fri, Jan 18, 2013 at 11:04:10AM +0800, Xiao Guangrong wrote:
>> On 01/18/2013 10:48 AM, Robin Holt wrote:
>>> On Fri, Jan 18, 2013 at 10:42:07AM +0800, Xiao Guangrong wrote:
>>>> On 01/17/2013 09:45 PM, Robin Holt wrote:
>>>>> On Thu, Jan 17, 2013 at 08:19:55PM +0800, Xiao Guangrong wrote:
>>>>>> On 01/17/2013 07:12 PM, Robin Holt wrote:
>>>>>>> On Thu, Jan 17, 2013 at 10:45:32AM +0800, Xiao Guangrong wrote:
>>>>>>>> On 01/17/2013 05:01 AM, Robin Holt wrote:
>>>>>>>>>
>>>>>>>>> There is a race condition between mmu_notifier_unregister() and
>>>>>>>>> __mmu_notifier_release().
>>>>>>>>>
>>>>>>>>> Assume two tasks, one calling mmu_notifier_unregister() as a result
>>>>>>>>> of a filp_close() ->flush() callout (task A), and the other calling
>>>>>>>>> mmu_notifier_release() from an mmput() (task B).
>>>>>>>>>
>>>>>>>>>     A                           B
>>>>>>>>> t1                              srcu_read_lock()
>>>>>>>>> t2  if (!hlist_unhashed())
>>>>>>>>> t3                              srcu_read_unlock()
>>>>>>>>> t4  srcu_read_lock()
>>>>>>>>> t5                              hlist_del_init_rcu()
>>>>>>>>> t6                              synchronize_srcu()
>>>>>>>>> t7  srcu_read_unlock()
>>>>>>>>> t8  hlist_del_rcu()  <--- NULL pointer deref.
>>>>>>>>
>>>>>>>> The detailed code here is:
>>>>>>>> 	hlist_del_rcu(&mn->hlist);
>>>>>>>>
>>>>>>>> Can mn be NULL? I do not think so, since mn is always the embedded struct
>>>>>>>> of the caller; it is freed after calling mmu_notifier_unregister().
>>>>>>>
>>>>>>> If you look at __mmu_notifier_release() it is using hlist_del_init_rcu()
>>>>>>> which will set the hlist->pprev to NULL. When hlist_del_rcu() is called,
>>>>>>> it attempts to update *hlist->pprev = hlist->next and that is where it
>>>>>>> takes the NULL pointer deref.
>>>>>>
>>>>>> Yes, sorry for my carelessness. So, can that not be fixed by using
>>>>>> hlist_del_init_rcu() instead?
>>>>>
>>>>> The problem is the race described above. Thread 'A' has checked to see
>>>>> if n->pprev != NULL. Based upon that, it called the mn->release()
>>>>> method.
>>>>> While it was trying to call the release method, thread 'B' ended
>>>>> up calling hlist_del_init_rcu() which set n->pprev = NULL. Then thread
>>>>> 'A' got to run again and now it tries to do the hlist_del_rcu() which, as
>>>>> part of __hlist_del(), sets pprev to n->pprev (which is NULL),
>>>>> and then *pprev = n->next; hits the NULL pointer deref.
>>>>
>>>> I mean using hlist_del_init_rcu() instead of hlist_del_rcu() in
>>>> mmu_notifier_unregister(); hlist_del_init_rcu() is aware of ->pprev.
>>>
>>> How does that address the calling of the ->release() method twice?
>>
>> Hmm, what is the problem with that? If it is just a "performance issue", I think
>> it is not worth introducing such a complex lock rule just for a really rare case.
>
> Complex lock rule? We merely moved the lock up earlier in the code path.
> Without this, we have some cases where you get called on ->release()
> twice, while in the majority of cases your notifier gets called once, and
> it hits a NULL pointer deref at that. What is so complex about that?

Aha, if we use hlist_del_init_rcu() instead of hlist_del_rcu(), can the NULL
deref bug be fixed?

- If yes, you'd better make it a simple patch, which is good for backporting.
  Then make a second patch to fix the "problem" of calling ->release() twice.

- If no, could you please add more detail to the changelog? From the changelog,
  I only see that the bug is caused by calling hlist_del_rcu() on an unhashed
  node.

Thank you! :)
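
To make the two issues under discussion concrete, here is a small userspace
sketch that replays the t1..t8 interleaving sequentially. It is not kernel
code: the struct definitions are simplified stand-ins, and model_hlist_del(),
model_hlist_del_init(), model_release() and replay_race() are hypothetical
names that only mirror what __hlist_del() and hlist_del_init_rcu() do to
->pprev. It assumes task A's final delete is the guarded variant.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

struct race_result {
	int release_calls;      /* how many times the release callout ran */
	int unhashed_before_a;  /* node state when task A reaches its delete */
	int list_empty_after;   /* list state once both tasks are done */
};

static int release_calls;

/* Hypothetical stand-in for the notifier's ->release() callout. */
static void model_release(void)
{
	release_calls++;
}

static int hlist_unhashed(const struct hlist_node *n)
{
	return n->pprev == NULL;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Mirrors __hlist_del(): writes through n->pprev with no NULL check. */
static void model_hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;
}

/* Mirrors hlist_del_init_rcu(): skips unhashed nodes, then clears pprev. */
static void model_hlist_del_init(struct hlist_node *n)
{
	if (!hlist_unhashed(n)) {
		model_hlist_del(n);
		n->pprev = NULL;
	}
}

/* Replays the race in program order; no real concurrency is involved. */
struct race_result replay_race(void)
{
	struct hlist_head head = { NULL };
	struct hlist_node mn = { NULL, NULL };
	struct race_result r;
	int a_will_release;

	release_calls = 0;
	hlist_add_head(&mn, &head);

	/* t2 (A): the node is still hashed, so A commits to ->release(). */
	a_will_release = !hlist_unhashed(&mn);

	/* t1..t6 (B): call ->release(), then unhash the node. */
	model_release();
	model_hlist_del_init(&mn);

	/* t4..t7 (A): acts on the stale check and calls ->release() again. */
	if (a_will_release)
		model_release();

	/* t8 (A): pprev is NULL here; an unguarded model_hlist_del() would
	 * write through it. The guarded variant is a no-op instead. */
	r.unhashed_before_a = hlist_unhashed(&mn);
	model_hlist_del_init(&mn);

	r.release_calls = release_calls;
	r.list_empty_after = (head.first == NULL);
	return r;
}
```

Calling replay_race() from a test main() shows that, in this model, the
guarded delete avoids the write through a NULL pointer at t8, while the
release callout still runs twice. That second part is what the lock change
in the patch is aimed at.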