Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6)

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Wed, 13 Jul 2011 02:39:36 +0100

On Wed, Jul 13, 2011 at 01:56:34AM +0100, Al Viro wrote:

> Nick, could you please describe the locking rules you had in mind for
> ->d_lock?  unlazy_walk() (aka nameidata_dentry_drop_rcu()) can probably
> be dealt with by checking d_seq twice, once before locking the child.
> Then we could be sure that it's still a child of parent and will stay
> so as long as parent's ->d_lock is held, and thus the ordering would
> stay stable...

As the matter of fact, can we ever get there with IS_ROOT(dentry)?
AFAICS, that should be impossible - dentry->d_seq would have to be
changed by whatever had torn it off the tree and we would have
buggered off on __d_rcu_to_refcount() failing...

AFAICS, the only way to get there would be with mountpoint crossing
returning a symlink with symlink already killed by rename() somehow
(call in walk_component()).  The first part should be impossible -
symlinks can't be mounted/bound on anything (and if it would be
possible, we'd trigger that BUG_ON() if symlink was still alive,
anyway).

So here's what I want to do to unlazy_walk(); it'll almost certainly
leave other problems with ->d_lock, but at least it'll take care of
that one:

Make sure that child is still a child of parent before nested locking
of child->d_lock in unlazy_walk(); otherwise we are risking a violation
of locking order and deadlocks.

Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
---

diff --git a/fs/namei.c b/fs/namei.c
index 0223c41..5c867dd 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -433,6 +433,8 @@ static int unlazy_walk(struct nameidata *nd, struct dentry *dentry)
 			goto err_parent;
 		BUG_ON(nd->inode != parent->d_inode);
 	} else {
+		if (dentry->d_parent != parent)
+			goto err_parent;
 		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
 		if (!__d_rcu_to_refcount(dentry, nd->seq))
 			goto err_child;
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html