On Fri, Mar 15, 2019 at 01:50:21AM +0000, Al Viro wrote: > > If it fails, we call __lock_parent(). Which > * grabs RCU lock > * drops ->d_lock (now we are not holding ->d_lock > on anything). > * fetches ->d_parent. Note the READ_ONCE() there - > it's *NOT* stable (no ->d_lock held). We can't expect > that ->d_parent won't change or that the reference it used > to contribute to parent's refcount is there anymore; as > the matter of fact, the only thing that prevents outright > _freeing_ of the object 'parent' points to is rcu_read_lock() > and RCU delay between dropping the last reference and > actual freeing of the sucker. rcu_read_lock() is there, > though, which makes it safe to grab ->d_lock on 'parent'. > > That 'parent' might very well have nothing to do with our > dentry by now. We can check if it's equal to its > ->d_parent, though. dentry->d_parent is *NOT* stable > at that point. It might be changing right now. > > However, the first store to dentry->d_parent making it > not equal to parent would have been done under parent->d_lock. > And since we are holding parent->d_lock, we won't miss that > store. We might miss subsequent ones, but if we observe > dentry->d_parent == parent, we know that it's stable. And > if we see dentry->d_parent != parent, we know that dentry > has moved around and we need to retry anyway. Why isn't it necessary to use READ_ONCE(dentry->d_parent) here? if (unlikely(parent != dentry->d_parent)) { Suppose 'parent' is 0xAAAABBBB, and 'dentry->d_parent' is 0xAAAAAAAA and is concurrently changed to 0xBBBBBBBB. d_parent could be read in two parts, 0xAAAA then 0xBBBB, resulting in it appearing that d_parent == 0xAAAABBBB == parent. Yes it won't really be compiled as that in practice, but I thought the point of READ_ONCE() is to *guarantee* it's really done right... - Eric