Re: [PATCH] vfs: fix race in rcu lookup of pruned dentry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jul 17, 2011 at 11:31 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Now, I do agree that maybe that case simply should check the dentry
> sequence count. I wish all cases did. Hugh patch did that. But the
> reason I dislike Hugh's patch is that when I say "I wish they all
> did", I mean that I dislike the special casing. And Hugh's patch just
> adds *more* special casing for that NULL entry - I'd wish we just
> always did it regardless of whether it was NULL or not.

Btw, looking at that, I think Hugh's patch is wrong. It does

  if (!read_seqcount_retry(&dentry->d_seq, nd->seq))

but that's after we've done the __follow_mount_rcu() that may actually
have changed "nd->seq" to the mount-point inode (and has changed
path->dentry to match it).

Now, it only does it if inode is NULL, so I guess it doesn't matter,
but it's the kind of inconsistency that I think is really dangerous,
because it basically compares incompatible sequence numbers.

Also, looking at that whole mount-point traversal sequence, it looks
like __follow_mount_rcu() will happily totally ignore the old sequence
number when it replaces it with the mount-point sequence number. So it
looks to me that we have a case where we miss the sequence number
check that can happen with a positive dentry too!

No?

So I think that whenever we change "nd->seq", we should always heck
the previous sequence number first (the way do_lookup() itself does
for the *normal* traversal case). Otherwise we will have traversed the
mount-point without ever having checked the previous sequence number.

Something like the (untested) attached patch.

Comments? This mount-point case is independent of the negative dentry
issue, and probably never really an issue in practice, but...

                          Linus
 fs/namei.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 5c867dd1c0b3..f78d2af3aef4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -938,6 +938,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path,
 {
 	for (;;) {
 		struct vfsmount *mounted;
+		unsigned int seq;
+
 		/*
 		 * Don't forget we might have a non-mountpoint managed dentry
 		 * that wants to block transit.
@@ -952,9 +954,18 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path,
 		mounted = __lookup_mnt(path->mnt, path->dentry, 1);
 		if (!mounted)
 			break;
+		seq = read_seqcount_begin(&mounted->mnt_root->d_seq);
+
+		/*
+		 * The memory barrier in read_seqcount_begin() is sufficient,
+		 * so we can use __read_seqcount_retry() to check the prev
+		 * sequence numbers.
+		 */
+		if (!__read_seqcount_retry(&path->dentry->d_seq, nd->seq))
+			return false;
 		path->mnt = mounted;
 		path->dentry = mounted->mnt_root;
-		nd->seq = read_seqcount_begin(&path->dentry->d_seq);
+		nd->seq = seq;
 	}
 	return true;
 }

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux