On Sun, 08 Nov 2009 02:15:57 -0800
ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:

> Jeff Layton <jlayton@xxxxxxxxxx> writes:
>
> > On Fri, 6 Nov 2009 20:36:01 +0000
> > Jamie Lokier <jamie@xxxxxxxxxxxxx> wrote:
> >
> >> Jeff Layton wrote:
> >> > The problem here is that this makes that code shortcut any lookup or
> >> > revalidation of the dentry. In general, this isn't a problem -- in most
> >> > cases the dentry is known to be good. It is a problem however for NFSv4.
> >> > If this symlink is followed on an open operation no actual open call
> >> > occurs and the open state isn't properly established. This causes
> >> > problems when we later try to use this file descriptor for actual
> >> > operations.
> >>
> >> As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
> >> important to do _something_ on new opens, rather than just cloning
> >> most of the file descriptor.
> >>
> >
> > I guess you mean the close-to-open cache consistency? If so, this
> > problem doesn't actually break that. The actual nfs_file_open call does
> > occur even when you're opening by following one of these symlinks. I
> > believe the cache consistency code occurs there.
> >
> > The problem here is really nfsv4 specific. There the on-the-wire open
> > call and initialization of state actually happens during d_lookup and
> > d_revalidate. Neither of these happens with these LAST_BIND symlinks so
> > we end up with a filp that has no NFSv4 state attached.
> >
> >> > This patch takes a minimalist approach to fixing this by making the
> >> > /proc/pid follow_link routine revalidate the dentry before returning it.
> >>
> >> What happens if the file descriptor you are re-opening is for a file
> >> which has been deleted. Does it still have a revalidatable dentry?
> >>
> >
> > Well, these LAST_BIND symlinks return a real dget'ed dentry today. If
> > we assume that it always returns a valid dentry (which seems to be the
> > case), then I suppose it's OK to do a d_revalidate against it.
> >
> > It's possible though that that revalidate will either fail though or
> > return that it's no good. In that case, this code just returns ESTALE
> > which should make the path walking code revalidate all the way up the
> > chain. That should (hopefully) make whatever syscall we're servicing
> > return an error.
>
> Hmm. Looking at the code I get the impression that a file bind mount
> will have exactly the same problem.
>
> Can you confirm.
>
> If file bind mounts also have this problem a bugfix to just
> proc seems questionable.
>

I'm not sure I understand what you mean by "file bind mount". Is that
something like mounting with "-o loop"?

I'm not at all opposed to fixing this in a more broad fashion, but as
best I can tell, the only place that LAST_BIND is used is in procfs.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
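
For anyone who wants to exercise the code path discussed above from
user space, here is a minimal sketch (not part of the patch). It opens
a file, then opens it a second time by following the /proc/self/fd/<N>
symlink, which is the kind of open the thread is concerned with. The
helper names (reopen_via_proc, same_file, demo) and the optional
"directory" parameter are made up for illustration. On a local
filesystem this should simply work; pointing "directory" at an NFSv4
mount on an affected kernel is where, per the description above,
operations on the re-opened descriptor would be expected to misbehave.

```python
import os
import tempfile


def reopen_via_proc(fd, flags=os.O_RDONLY):
    """Open a new descriptor for the file behind fd by following the
    /proc/self/fd/<fd> symlink (Linux only)."""
    return os.open(f"/proc/self/fd/{fd}", flags)


def same_file(fd_a, fd_b):
    """True if both descriptors refer to the same device and inode."""
    a, b = os.fstat(fd_a), os.fstat(fd_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def demo(directory=None):
    """Create a scratch file, re-open it through /proc, and read it back.

    'directory' is a hypothetical knob: pass a path on the filesystem
    you want to test (for example an NFSv4 mount). Returns a tuple
    (same_inode, data_read_from_reopened_fd).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"hello")
        new_fd = reopen_via_proc(fd)
        try:
            # The re-opened descriptor has its own file offset, so this
            # read starts at the beginning of the file.
            return same_file(fd, new_fd), os.read(new_fd, 5)
        finally:
            os.close(new_fd)
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling demo() with no argument uses the default temporary directory;
demo("/some/nfs4/mount") (a placeholder path) would run the same steps
on the filesystem in question.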