On Thu, 22 Dec 2011, Alex Elder wrote: > I'll send all this to ceph-devel later, I just wanted [added cc] > to document it while it's fresh and send it to *someone* > and you made this change so I'm sending it to you. The > change in question is: > > be655596b3de5873f994ddbe205751a5ffb4de39 > ceph: use i_ceph_lock instead of i_lock > > First, after updating my ceph tree and rebuilding I found > that every time I completed a mount in my UML environment > it would immediately crash the UML. > > This led to me learning how to attach a debugger to > the UML (good) and this morning I was able to capture > this situation in gdb. > > The problem is in ceph_dir_set_complete(), which is > blindly dereferencing the return value of > __d_find_any_alias(inode), and that function is > returning NULL because the inode's i_dentry list > is empty. > > So that's problem 1. Yeah, looks like I broke that with 9f6f8f50663e6cf42cca64f1f5ee0ca438c33c46. Just moving the dentry->d_sb inside the if should do it! Strange that this didn't trigger on sepia... :/ > In looking at this I find that __d_find_any_alias() > is duplicated in ceph, to (almost) match the Linux > implementation which is not exported. But unlike > the Linux version, the ceph version does NOT grab > a reference to the found alias. I consider this a > bug, because a function with an identical name should > perform identically to the original, otherwise it > should be named something different. > > That's problem 2. > Furthermore, I looked at the locking that's done for > the Linux version of __d_find_any_alias, and that > truly does require taking the Linux inode lock--the > (new) ceph inode i_ceph_lock is not sufficient. Note > that __d_find_any_alias() really only operates on the > Linux inode and does nothing with the ceph inode, so > it may be that simply changing the locking used *in > this case* back to using the Linux inode lock is > enough to fix the problem. But I'm not familiar > enough at this point to know the locking context > where these calls are made, so can't be sure doing > this would cause problems (like lock inversions). > > There are three places this __d_find_any_alias() > is called: ceph_dir_{set,clear,test}_complete(). > Those are straightforward, but the calls to *those* > functions are mostly embedded in long strings of > logical conditions so it's going to take a little > work to unwind this. > > So that's problem 3, which is the biggest one. I think the __d_find_any_alias() needs to take the i_lock and a reference to make it fully safe. i_lock is ordered inside i_ceph_lock, now, so that should be safe in all of the call chains. Better yet, we should just export the VFS version and use that. What do you think? A simple fix is best so we can get it into 3.2 (which is about a week away?). > Potential problem 4 is that this suggests the > mechanical switch to using i_ceph_lock is a little > suspect and might need a closer look. Yeah. Well, this is one of the few cases where i_lock where we muddle with the dcache internal structure. The others should pretty much all be in dir.c, and maybe inode.c's splice_dentry() and friends. The dcache readdir is by far the most suspect of all of those, but also probably the lowest priority at the moment. Maybe your visit in January would be a good time to go over it in detail? sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html