On Fri, Jan 09, 2015 at 11:36:44PM +0000, Al Viro wrote:
> On Fri, Jan 09, 2015 at 06:12:48PM -0500, Rich Felker wrote:
> >
> > I'm not sure where you're disagreeing with me. open of procfs symlinks
> > does not resolve the symlink and open the resulting pathname. They are
> > "magic symlinks" which are bound to the inode of the open file. I
> > don't see why this action, which is already special for magic
> > symlinks, can't check a flag on the magic symlink and possibly close
> > the corresponding file descriptor as part of its action.
>
> _What_ action? ->follow_link()? As in "the same thing that e.g.
> stat(2) would trigger"?

To elaborate a bit: the fundamental method for symlink traversal is
->follow_link(). It gets dentry of the object itself + opaque context.
Usually it just obtains some string (== symlink contents) and calls
nd_set_link(context, string). In that case the string will be interpreted
by its callers in usual way. Another possibility is to call
nd_jump_link(context, location), which will reset the current position
(directory in which the symlink has been found and relative to which it
would be interpreted) to given location in tree. It might actually do
both - then the string will be interpreted relative to the new location.

Once the pathname resolution is done with the string stored by
nd_set_link(), it calls another method - ->put_link(). That one releases
the object that contains this string; it gets an opaque pointer returned
by ->follow_link(). Returning ERR_PTR(-Esomething) indicates an error,
so does nd_set_link(context, ERR_PTR(-Esomething)).

readlink(2) is using a different method (->readlink()) and any object
whose ->follow_link() only uses nd_set_link() can use generic_readlink
as its ->readlink instance - that will call ->follow_link(), copy the
string stored by nd_set_link() to userland buffer and use ->put_link()
to release whatever needs to be released. Most of the symlinks are doing
just that.
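
The calling convention described above can be illustrated with a toy
userspace model. This is only a sketch, not kernel code: every toy_*
type and function is hypothetical, and nd_set_link()/nd_jump_link()
here merely borrow the names from the mail with simplified signatures
(plain strings stand in for both the symlink body and a location in
the tree). It shows a caller that treats every symlink identically,
one ->follow_link() that leaves a string behind, and one that jumps.

```c
/* Toy model of ->follow_link()/->put_link(); NOT the kernel's definitions. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_nameidata {
    const char *pos;   /* current position (a path string stands in for it) */
    const char *link;  /* string left by nd_set_link(), or NULL */
};

struct toy_dentry {
    const char *target;  /* symlink body, or jump destination */
    void *(*follow_link)(struct toy_dentry *, struct toy_nameidata *);
    void (*put_link)(struct toy_dentry *, struct toy_nameidata *, void *);
};

/* Simplified stand-ins: same names as in the mail, assumed signatures. */
static void nd_set_link(struct toy_nameidata *nd, const char *s) { nd->link = s; }
static void nd_jump_link(struct toy_nameidata *nd, const char *where) { nd->pos = where; }

/* Ordinary symlink: hand the caller a string to interpret. */
static void *plain_follow(struct toy_dentry *d, struct toy_nameidata *nd)
{
    char *copy = malloc(strlen(d->target) + 1);
    if (copy) {
        strcpy(copy, d->target);
        nd_set_link(nd, copy);
    }
    return copy;  /* opaque cookie, later passed to ->put_link() */
}

/* Release whatever ->follow_link() allocated for the string. */
static void plain_put(struct toy_dentry *d, struct toy_nameidata *nd, void *cookie)
{
    (void)d;
    nd->link = NULL;
    free(cookie);
}

/* "Magic" symlink: move the current position, leave no string behind. */
static void *magic_follow(struct toy_dentry *d, struct toy_nameidata *nd)
{
    nd_jump_link(nd, d->target);
    return NULL;
}

/* The caller: identical for every symlink, it never asks what kind it is. */
static void toy_traverse(struct toy_dentry *d, struct toy_nameidata *nd,
                         char *out, size_t n)
{
    void *cookie;

    nd->link = NULL;
    cookie = d->follow_link(d, nd);
    if (!nd->link)
        snprintf(out, n, "%s", nd->pos);               /* jumped, or nothing set */
    else if (nd->link[0] == '/')
        snprintf(out, n, "%s", nd->link);              /* absolute body */
    else
        snprintf(out, n, "%s/%s", nd->pos, nd->link);  /* relative to position */
    if (d->put_link)
        d->put_link(d, nd, cookie);
}
```

Traversing a toy_dentry built with plain_follow/plain_put from position
"/dir" and body "file.txt" yields "/dir/file.txt"; one built with
magic_follow and target "/opened/file" yields "/opened/file" with no
string ever visible to the caller.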

procfs "magical" symlinks have ->follow_link() that uses nd_jump_link();
they obviously can't use generic_readlink() (there is no string left by
->follow_link() for caller to traverse), so they have non-standard
->readlink() instances - ones that use d_path() to generate a plausible
pathname of the would-be destination of their ->follow_link(). Or
something like pipe:[696969], etc.

Note, however, that ->readlink() is used only by readlink(2) syscall; as
far as pathname resolution is concerned it is completely irrelevant.
What matters is ->follow_link().

Now, the callers do not know (and do not care) what a particular symlink
_is_. A symlink is just a dentry with inode that has non-NULL
->follow_link() method. That's it. Moreover, _any_ pathname resolution
is using the same method for symlink traversal, be it open(2), stat(2),
whatever. If a symlink is to be traversed, that's it - the only choice
VFS has is whether to traverse it at all or not (think of stat(2) vs
lstat(2) difference, or O_NOFOLLOW, etc.)

_After_ the traversal it's too late to do this sort of thing - after all,
how do you tell if your current position had been set by the traversal
of your symlink or that of any normal /proc/self/fd/<n>? And doing that
_during_ the traversal would really suck - stray ls -lR /proc could race
with that open() done by script interpreter. It might be possible to
work around that, but trying that rapidly gets into very ugly territory,
*especially* since the handling of the final component of open(2)
(fs/namei.c:do_last()) is already far too convoluted.
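
The points above can be observed from userspace. The following is a
rough sketch, assuming Linux with /proc mounted; probe_proc_fd_link()
and struct probe_result are hypothetical names made up for this
illustration. It creates a pipe and looks at /proc/self/fd/<n> for its
read end: lstat(2) does not traverse and sees a symlink, stat(2)
traverses and sees a FIFO, readlink(2) returns text of the pipe:[...]
form that is not a usable pathname, and open(2) through the link works
anyway unless O_NOFOLLOW asks for no traversal.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record for this demo. */
struct probe_result {
    int lstat_is_symlink;  /* lstat(2): link not traversed */
    int stat_is_fifo;      /* stat(2): link traversed, pipe inode reached */
    int readlink_is_pipe;  /* readlink(2) text starts with "pipe:[" */
    int open_worked;       /* open(2) through the link succeeded */
    int nofollow_refused;  /* open(2) with O_NOFOLLOW failed */
};

/* Returns 0 on success, -1 if no pipe or no /proc/self/fd is available. */
static int probe_proc_fd_link(struct probe_result *r)
{
    int p[2], fd;
    char path[64], text[64];
    struct stat st;
    ssize_t n;

    memset(r, 0, sizeof *r);
    if (pipe(p) < 0)
        return -1;
    snprintf(path, sizeof path, "/proc/self/fd/%d", p[0]);

    if (lstat(path, &st) < 0) {          /* /proc missing or not Linux */
        close(p[0]);
        close(p[1]);
        return -1;
    }
    r->lstat_is_symlink = S_ISLNK(st.st_mode) != 0;

    if (stat(path, &st) == 0)
        r->stat_is_fifo = S_ISFIFO(st.st_mode) != 0;

    /* The text is descriptive only; nothing named "pipe:[...]" exists. */
    n = readlink(path, text, sizeof text - 1);
    if (n >= 0) {
        text[n] = '\0';
        r->readlink_is_pipe = strncmp(text, "pipe:[", 6) == 0;
    }

    /* Resolution goes through traversal, not through the readlink text. */
    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd >= 0) {
        r->open_worked = 1;
        close(fd);
    }

    /* The only knob the caller has: do not traverse at all. */
    fd = open(path, O_RDONLY | O_NONBLOCK | O_NOFOLLOW);
    if (fd < 0)
        r->nofollow_refused = 1;
    else
        close(fd);

    close(p[0]);
    close(p[1]);
    return 0;
}
```

On a typical Linux system all five fields are expected to come back
nonzero. Nothing in the sketch can tell, after the fact, whether a
position was reached via this link or via any other path to the same
pipe, which is the difficulty described above.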