On Tue, Jul 12, 2011 at 05:04:55PM -0700, Linus Torvalds wrote: > > All flakiness of the locking "order" aside, here we simply lock two dentries > > that might be nowhere near each other by now. ?Hell, by that point the parent > > might've been moved under (what used to be) child. ?Or it might have address > > greater than that of child and be not an ancestor anymore. ?Note that no > > i_mutex, etc. is held at that point, so there's no external serialization > > to save our arses... > > So why wouldn't it be sufficient to just take the s_vfs_rename_mutex > in the caller? We're only talking d_materialise_unique(), no? Alas, no. d_materialize_unique() aside (we have a couple of bugs in there), ->d_lock handling is fscked up. Basically, it's unlazy_walk() taking ->d_lock instances without anything to guarantee that it's doing that in order consistent with d_move() (and any other users, for that matter). *And* the "locking order" in question is not transitive, but that's a separate story. And no, we obviously are not going to serialize the switch from RCU to normal pathwalk on a per-fs mutex... I'm crawling through the d_lock users right now (>400 places ;-/), trying to put together reasonable locking rules. FWIW, we used to have this for d_move() et.al.: * all places changing ->d_parent hold i_mutex on parent(s) * if parents differ, we have ->s_vfs_rename_mutex as well * if old_dentry was not attached to the tree, it was positive and a subdirectory of new_dentry->d_parent. Said new_dentry->d_parent was been locked by caller. Unfortunately, current tree has these rules violated in several places. And these rules have nothing to say about ->d_lock ;-/ Nick, could you please describe the locking rules you had in mind for ->d_lock? unlazy_walk() (aka nameidata_dentry_drop_rcu()) can probably be dealt with by checking d_seq twice, once before locking the child. Then we could be sure that it's still a child of parent and will stay so as long as parent's ->d_lock is held, and thus the ordering would stay stable... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html