Hello,

On Thu, Jan 13, 2011 at 02:54:30PM +0100, Thomas Gleixner wrote:
> The dcache scalability work broke NFS root filesystems.
>
> "cd /" results in the following problem:
>
>     link_path_walk("/",...);
>     jumps to return_reval
>     need_reval_dot() returns true for NFS
>     d_revalidate()
>     dentry->d_op->d_revalidate(dentry, nd);
>     returns -ECHILD due to nd->flags & LOOKUP_RCU
>     nameidata_dentry_drop_rcu()
>     spin_lock(&parent->d_lock);
>     spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
>
> This deadlocks because dentry == parent
>
> This problem exists for any filesystem which implements d_revalidate.
>
> Uwe bisected is down to commit 34286d6(fs: rcu-walk aware d_revalidate
> method), but reverting that patch causes different wreckage to show up.
>
> Check for parent equal dentry and skip the nested lock to avoid the
> deadlock. I'm sure this is the wrong fix, but at least it "works" :)
>
> Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@xxxxxxxxxxxxxx>
> Reported-by: "Ramirez Luna, Omar" <omar.ramirez@xxxxxx>
> Not-Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
>  fs/namei.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> Index: linux-2.6/fs/namei.c
> ===================================================================
> --- linux-2.6.orig/fs/namei.c
> +++ linux-2.6/fs/namei.c
> @@ -487,6 +487,8 @@ static int nameidata_dentry_drop_rcu(str
>  			goto err_root;
>  	}
>  	spin_lock(&parent->d_lock);
> +	if (parent == dentry)
> +		goto same;
>  	spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
>  	if (!__d_rcu_to_refcount(dentry, nd->seq))
>  		goto err;
> @@ -499,6 +501,8 @@ static int nameidata_dentry_drop_rcu(str
>  	BUG_ON(!parent->d_count);
>  	parent->d_count++;
>  	spin_unlock(&dentry->d_lock);
> +
> +same:
>  	spin_unlock(&parent->d_lock);
>  	if (nd->root.mnt) {
>  		path_get(&nd->root);
>

Note there is a different patch available in the thread here:

	http://thread.gmane.org/gmane.linux.kernel/1087013/focus=1087048

The differences are that it tests for IS_ROOT(dentry)
instead of parent == dentry (which looks more reasonable IMVHO) and that
it increments parent->d_count even if the test triggers. (And it doesn't
skip the BUG_ONs, which hopefully doesn't make a difference.)

Note I really have no clue about the code below fs/, but I wonder if the
toplevel directories of mounts need some treatment here, too. (But I
expect that they don't. So I ask just in case ...)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
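
P.S. To make the locking problem and the difference between the two
patches concrete, here is a minimal userspace sketch. It is a toy model
under loud assumptions, not kernel code: toy_lock, toy_dentry,
toy_drop_rcu() and the guard enum are all hypothetical names invented
for illustration, the lock merely reports a double acquire instead of
spinning forever, and the reference counting is reduced to plain
increments. Only the IS_ROOT() definition (a dentry that is its own
d_parent) is taken from the kernel, and the GUARD_IS_ROOT behaviour
follows my description of the other patch above, not its actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock: it only records "held". */
struct toy_lock {
	bool held;
};

/* Toy stand-in for struct dentry with just the fields used here. */
struct toy_dentry {
	struct toy_lock d_lock;
	unsigned int d_count;
	struct toy_dentry *d_parent;
};

/* Which check (if any) is made before taking the nested lock. */
enum guard { GUARD_NONE, GUARD_PARENT_EQ, GUARD_IS_ROOT };

/* Returns false where a real spinlock would spin forever. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Mirrors the kernel's IS_ROOT(): a dentry that is its own parent. */
static bool toy_is_root(const struct toy_dentry *d)
{
	return d == d->d_parent;
}

/*
 * Simplified model of the locking order in nameidata_dentry_drop_rcu().
 * Returns 0 on success and -1 where the real code would self-deadlock.
 */
static int toy_drop_rcu(struct toy_dentry *parent, struct toy_dentry *dentry,
			enum guard g)
{
	bool same = (g == GUARD_PARENT_EQ && parent == dentry) ||
		    (g == GUARD_IS_ROOT && toy_is_root(dentry));

	if (!toy_lock_acquire(&parent->d_lock))
		return -1;

	if (!same) {
		if (!toy_lock_acquire(&dentry->d_lock)) {
			/* parent and dentry share one lock: self-deadlock */
			toy_lock_release(&parent->d_lock);
			return -1;
		}
		dentry->d_count++;
		parent->d_count++;
		toy_lock_release(&dentry->d_lock);
	} else if (g == GUARD_IS_ROOT) {
		/* assumed behaviour of the other patch: still count parent */
		parent->d_count++;
	}

	toy_lock_release(&parent->d_lock);
	return 0;
}
```

In this model, calling toy_drop_rcu(&root, &root, GUARD_NONE) on a
dentry that is its own parent reports the self-deadlock, while both
guarded variants succeed. The only difference the model shows between
them is that GUARD_PARENT_EQ leaves d_count untouched and GUARD_IS_ROOT
increments it once. Whether that difference matters in the real code is
exactly what I cannot judge.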