On 2010-07-02, at 16:09, Neil Brown wrote:
> On Fri, 2 Jul 2010 10:12:47 -0600
> Andreas Dilger <andreas.dilger@xxxxxxxxxx> wrote:
>>
>> I haven't looked at this part of the VFS in a while, but it looks like reconnect_path() is an implementation issue specific to knfsd, and shouldn't be needed for regular files.  i.e. if exportfs_encode_fh() is never used on a disconnected file, then this overhead is not incurred.
>>
>> The above use of open_by_handle() is not for userspace NFS/Samba re-export, but to allow applications to open regular files for IO.
>
> Firstly it is needed for directories so that the VFS can effectively lock
> against directory rename races which could otherwise create disconnected
> subtrees (where the first parent is a member only of one of its
> descendants).  So if you get a filehandle for a directory it *must* be
> properly connected to the root for rename to be safe.  This operation is
> faster than a full path lookup if the dentry is already in cache, and slower
> if it and any of the path is not in cache.

OK, so this requirement is specific to directories, and not at all needed for regular files.

> Secondly it is needed if you want to enforce the rule that the contents of a
> directory are only accessible if the 'x' bit on the directory is set.
> kNFSd does not enforce this (unless subtree_check is specified), partly
> because it is hard to do correctly and partly because we have to trust the
> client anyway, so trusting it to check the 'x' bit is very little extra trust.

If the application that called name_to_handle() already had to traverse the whole pathname to get the file handle, then there shouldn't necessarily be a requirement to do this again when calling open_by_handle().  The only possible permission checking in open_by_handle() is the permission on the inode itself.

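To make "the permission on the inode itself" concrete, here is a minimal sketch of a check that looks only at the inode's owner, group and mode bits and never at the 'x' bits of any parent directory.  The struct and function names are hypothetical, and the sketch deliberately ignores root, capabilities, ACLs and supplementary groups, so it is far simpler than what the VFS really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for the inode fields that matter here. */
struct sketch_inode {
    uid_t  uid;
    gid_t  gid;
    mode_t mode;   /* only the low nine bits (rwxrwxrwx) are used */
};

/*
 * Return true if the caller (uid, gid) may access the inode with the
 * requested mask: 4 = read, 2 = write, 1 = execute.
 * Only the inode is consulted; no path component is examined.
 */
static bool sketch_inode_permission(const struct sketch_inode *ino,
                                    uid_t uid, gid_t gid, unsigned want)
{
    unsigned bits;

    assert(ino != NULL);

    if (uid == ino->uid)
        bits = (ino->mode >> 6) & 7;   /* owner class */
    else if (gid == ino->gid)
        bits = (ino->mode >> 3) & 7;   /* group class */
    else
        bits = ino->mode & 7;          /* other class */

    return (bits & want) == want;
}
```

With a file of mode 0640 owned by uid 1000 and gid 100, a caller with uid 2000 and gid 100 may read but not write it, regardless of how the directories above it are protected.
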
> Note that it is not possible to reliably perform filehandle lookup for
> non-directories if you need a fully reconnected dentry, as
> cross-directory-renames can confuse the situation beyond recovery.

For normal file IO, a fully connected dentry is not needed, and in fact the handle_to_path->exportfs_decode_fh() code will accept any inode alias for regular file use.

> Maybe open-by-handle should require DAC_OVERRIDE, or maybe a new
> DAC_X_OVERRIDE.  And if those aren't provided it only works for directories.

That's the big question.  If the file handle has some "non-public" information in it (i.e. a capability that cannot be (easily) guessed or forged), then there should not be any need for DAC_OVERRIDE.  This could easily be enforced if there was a provision for "short term" file handles that only had to live a few minutes or less, so the kernel could just store a random cookie in each file handle and require applications to get a new handle if the cookie expires or the server crashes.

However, even a "plain" file handle containing only the inode/generation is relatively secure in this respect, since the only way to get the inode number of a particular file is "ls -li" (which either assumes path "x" traversal permission, OR guessing the inode number), and the generation is only available via ioctl(FS_IOC_GETVERSION), which requires being able to open the inode already.  Guessing the inode number by itself is fairly weak, at most 2^32 inodes in most filesystems, usually far fewer.  Guessing the generation number is much harder (though not impossible).

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
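
For illustration, the "short term" handle idea above could look roughly like the sketch below.  Everything in it is an assumption made for the example: the struct layout, the function names, and in particular the choice to derive the cookie from a per-boot secret rather than store a random value per handle.  The mixing function is a plain 64-bit bit mixer and is NOT cryptographically secure; a real implementation would need a keyed MAC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical handle layout, not the format of any real filesystem. */
struct sketch_handle {
    uint64_t ino;
    uint32_t generation;
    time_t   issued;     /* when the handle was handed out */
    uint64_t cookie;     /* value the holder should not be able to guess */
};

/* Hypothetical per-boot state: a new secret after a crash voids old handles. */
struct sketch_server {
    uint64_t boot_secret;
    time_t   lifetime;   /* seconds a handle stays valid */
};

/* Bit mixer (splitmix64 finalizer). Illustrative only, not a secure MAC. */
static uint64_t sketch_mix(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

static uint64_t sketch_cookie(const struct sketch_server *srv, uint64_t ino,
                              uint32_t generation, time_t issued)
{
    uint64_t c = sketch_mix(srv->boot_secret ^ ino);
    c = sketch_mix(c ^ generation);
    return sketch_mix(c ^ (uint64_t)issued);
}

static struct sketch_handle sketch_issue(const struct sketch_server *srv,
                                         uint64_t ino, uint32_t generation,
                                         time_t now)
{
    struct sketch_handle h = { ino, generation, now, 0 };

    assert(srv != NULL);
    h.cookie = sketch_cookie(srv, ino, generation, now);
    return h;
}

/* Accept a handle only if it is unexpired and its cookie matches this boot. */
static bool sketch_handle_valid(const struct sketch_server *srv,
                                const struct sketch_handle *h, time_t now)
{
    assert(srv != NULL && h != NULL);

    if (now < h->issued || now - h->issued > srv->lifetime)
        return false;
    return h->cookie == sketch_cookie(srv, h->ino, h->generation, h->issued);
}
```

Because the issue time is folded into the cookie, a holder cannot extend a handle's life by editing the timestamp, and a reboot with a fresh secret rejects every outstanding handle, which forces the application to call name_to_handle() again.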