This is a note to let you know that I've just added the patch titled

    shrink_dentry_list(): take parent's ->d_lock earlier

to the 3.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     shrink_dentry_list-take-parent-s-d_lock-earlier.patch
and it can be found in the queue-3.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 046b961b45f93a92e4c70525a12f3d378bced130 Mon Sep 17 00:00:00 2001
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Thu, 29 May 2014 08:54:52 -0400
Subject: shrink_dentry_list(): take parent's ->d_lock earlier

From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

commit 046b961b45f93a92e4c70525a12f3d378bced130 upstream.

The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one.  d_walk() takes them in the right order,
and unfortunately it's not hard to create a situation when
shrink_dentry_list() can't make progress since trylock keeps failing,
and shrink_dcache_parent() or check_submounts_and_drop() keeps calling
d_walk() disrupting the very shrink_dentry_list() it's waiting for.

Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order.  We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU.  And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.

That deals with a half of the problem - killing dentries on the
shrink list itself.  Another one (dropping their parents) is
in the next commit.

locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one.  Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order.  Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough.  In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.

Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 fs/dcache.c |   53 +++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 12 deletions(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -528,6 +528,38 @@ failed:
 	return dentry; /* try again with same dentry */
 }
 
+static inline struct dentry *lock_parent(struct dentry *dentry)
+{
+	struct dentry *parent = dentry->d_parent;
+	if (IS_ROOT(dentry))
+		return NULL;
+	if (likely(spin_trylock(&parent->d_lock)))
+		return parent;
+	spin_unlock(&dentry->d_lock);
+	rcu_read_lock();
+again:
+	parent = ACCESS_ONCE(dentry->d_parent);
+	spin_lock(&parent->d_lock);
+	/*
+	 * We can't blindly lock dentry until we are sure
+	 * that we won't violate the locking order.
+	 * Any changes of dentry->d_parent must have
+	 * been done with parent->d_lock held, so
+	 * spin_lock() above is enough of a barrier
+	 * for checking if it's still our child.
+	 */
+	if (unlikely(parent != dentry->d_parent)) {
+		spin_unlock(&parent->d_lock);
+		goto again;
+	}
+	rcu_read_unlock();
+	if (parent != dentry)
+		spin_lock(&dentry->d_lock);
+	else
+		parent = NULL;
+	return parent;
+}
+
 /*
  * This is dput
  *
@@ -805,6 +837,8 @@ static void shrink_dentry_list(struct li
 		struct inode *inode;
 		dentry = list_entry(list->prev, struct dentry, d_lru);
 		spin_lock(&dentry->d_lock);
+		parent = lock_parent(dentry);
+
 		/*
 		 * The dispose list is isolated and dentries are not accounted
 		 * to the LRU here, so we can simply remove it from the list
@@ -818,6 +852,8 @@ static void shrink_dentry_list(struct li
 		 */
 		if ((int)dentry->d_lockref.count > 0) {
 			spin_unlock(&dentry->d_lock);
+			if (parent)
+				spin_unlock(&parent->d_lock);
 			continue;
 		}
 
@@ -825,6 +861,8 @@ static void shrink_dentry_list(struct li
 		if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
 			bool can_free = dentry->d_flags & DCACHE_MAY_FREE;
 			spin_unlock(&dentry->d_lock);
+			if (parent)
+				spin_unlock(&parent->d_lock);
 			if (can_free)
 				dentry_free(dentry);
 			continue;
@@ -834,22 +872,13 @@ static void shrink_dentry_list(struct li
 		if (inode && unlikely(!spin_trylock(&inode->i_lock))) {
 			d_shrink_add(dentry, list);
 			spin_unlock(&dentry->d_lock);
+			if (parent)
+				spin_unlock(&parent->d_lock);
 			continue;
 		}
 
-		parent = NULL;
-		if (!IS_ROOT(dentry)) {
-			parent = dentry->d_parent;
-			if (unlikely(!spin_trylock(&parent->d_lock))) {
-				if (inode)
-					spin_unlock(&inode->i_lock);
-				d_shrink_add(dentry, list);
-				spin_unlock(&dentry->d_lock);
-				continue;
-			}
-		}
-
 		__dentry_kill(dentry);
+
 		/*
 		 * We need to prune ancestors too. This is necessary to prevent
 		 * quadratic behavior of shrink_dcache_parent(), but is also


Patches currently in stable-queue which might be from viro@xxxxxxxxxxxxxxxxxx are

queue-3.14/lock_parent-don-t-step-on-stale-d_parent-of-all-but-freed-one.patch
queue-3.14/lift-the-already-marked-killed-case-into-shrink_dentry_list.patch
queue-3.14/expand-dentry_kill-dentry-0-in-shrink_dentry_list.patch
queue-3.14/bcache-fix-a-leak-in-bch_cached_dev_run.patch
queue-3.14/dcache-add-missing-lockdep-annotation.patch
queue-3.14/dealing-with-the-rest-of-shrink_dentry_list-livelock.patch
queue-3.14/split-dentry_kill.patch
queue-3.14/shrink_dentry_list-take-parent-s-d_lock-earlier.patch
queue-3.14/dentry_kill-doesn-t-need-the-second-argument-now.patch
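
The ordering rule from the changelog (lock what looks like the parent right now, recheck that it is still the parent, then lock the child) can be tried outside the kernel. What follows is only a minimal userspace sketch of that rule, not the kernel code: struct node, node_init() and lock_parent_sketch() are hypothetical names, pthread mutexes stand in for ->d_lock, and RCU is left out on the assumption that nodes are never freed while in use. It also assumes, as the patch does for dentries, that a node's parent pointer is only changed with the parent's lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry: one lock plus a parent pointer. */
struct node {
	pthread_mutex_t lock;
	_Atomic(struct node *) parent;	/* a root points to itself */
};

static void node_init(struct node *n, struct node *parent)
{
	assert(n != NULL && parent != NULL);
	pthread_mutex_init(&n->lock, NULL);
	atomic_init(&n->parent, parent);
}

/*
 * Call with child->lock held.  Returns the parent with its lock held,
 * or NULL if child is a root.  child->lock is held on return either way.
 */
static struct node *lock_parent_sketch(struct node *child)
{
	struct node *parent = atomic_load(&child->parent);

	if (parent == child)
		return NULL;
	/* Fast path: the out-of-order attempt is safe only as a trylock. */
	if (pthread_mutex_trylock(&parent->lock) == 0)
		return parent;

	/* Slow path: drop the child, then take the locks parent first. */
	pthread_mutex_unlock(&child->lock);
	for (;;) {
		parent = atomic_load(&child->parent);
		pthread_mutex_lock(&parent->lock);
		/*
		 * Assumed rule: the parent pointer changes only under the
		 * parent's lock, so this recheck is stable once we hold it.
		 */
		if (parent == atomic_load(&child->parent))
			break;
		pthread_mutex_unlock(&parent->lock);
	}
	if (parent == child)
		return NULL;	/* became a root; its own lock is already held */
	pthread_mutex_lock(&child->lock);
	return parent;
}
```

A caller would take child->lock, call lock_parent_sketch(child), do its work, then release child->lock and, when the return value is non-NULL, the parent's lock as well. Because the slow path drops child->lock for a moment, any state checked before the call has to be rechecked afterwards, which is why the patch calls lock_parent() at the very top of the loop.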