On Sat, 20 Feb 2010 11:58:34 -0700, Andreas Dilger <adilger@xxxxxxx> wrote: > On 2010-02-18, at 22:42, Aneesh Kumar K.V wrote: > > +static struct dentry *handle_to_dentry(struct path *path, > > + struct file_handle *fh) > > +{ > > + inode = path->dentry->d_inode; > > + if (!S_ISREG(inode->i_mode) && > > + !S_ISDIR(inode->i_mode) && > > + !S_ISLNK(inode->i_mode)) { > > Please follow normal indenting rules, like: > > if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode) && > !S_ISLNK(inode->i_mode)) { > > > + /* should we do some check on handle size ?*/ > > + handle = kmalloc(handle_size, GFP_KERNEL); > > Good question. kmalloc() allows up to 32MB allocations, so a user > running 1024 threads calling this with bad handles could OOM most > systems today due to unswappable kernel allocations. Should there be > an upper limit on the handle size, like 4096 bytes or so? Should some > filesystem legitimately need a larger size the kernel can always be > changed without problems since the size isn't part of the ABI (it > should be an internal sanity check), but I can't imagine what needs to > be put into a file handle that would exceed this size. At this point i will limit it to 4096 bytes. > > > + dentry = exportfs_decode_fh(path->mnt, (struct fid *)handle, > > + handle_size, fh->handle_type, > > + vfs_dentry_acceptable, NULL); > > One serious problem I can see with this is that the handle only > encodes the ino+generation of the inode, and nothing about the device > itself. NFS handles this by using the fsid to identify the > filesystem, but it would be highly confusing to applications if a > process with a different CWD gets a different file or an error trying > to open the file handle, or if a new filesystem gets mounted and the > file handle stops working. Having the file handle able to positively > identify a single inode in the system is the most important property > it has, I think. > > Probably one of the important use cases of open_by_handle() is to have > one process doing pathname resolution and slave threads doing work on > those handles. Usually daemon threads change their working directory > to "/" to avoid pinning some directory. Even without that, it would > be impossible for a process to create handles in 2 different > filesystems, which is totally broken: > > (CWD is /home/adilger) > name_to_handle("/usr/src/linux/COPYING", &fh); > > fd = open_by_handle(&fh, O_RDONLY); > (fd = -1; errno = ENOENT or EBADF or whatever) > do_sys_open_by_handle actually takes an fd. I was planning to do an openat style syscall also. That should handle the above problem. In case of the new syscall dirfd will be a file descriptor for a file in the particular filesystem. > Without the handle identifying the filesystem in some way this is > mostly useless. Encoding the device number would be a simple (though > non-robust) way to do this, a better way would be with the filesystem > UUID and adding an in-kernel UUID->sb mapping function (which is > useful for other things as well). > But i understand this is an good requirement. I will see what best I can do. > > +long do_sys_open_by_handle(int dfd, struct file_handle *fh, int > > flags) > > +{ > > + if (!capable(CAP_SYS_ADMIN)) > > + /* Allow open by handle only by sysadmin */ > > + return -EPERM; > > Hmm, I guess this avoids some of the security concerns, but makes it a > lot less useful. I was thinking this could be used for e.g. user NFS > serving or such, but if it is limited to root only then you might as > well just set up the in-kernel NFSd. By making the handle hard to > forge (e.g. generate random key per superblock, sha1(ino+gen+key) and > store that into fh; someone with more security experience can think of > a better scheme) then you can reasonably safely dispense with the > CAP_SYS_ADMIN check because you can be sure that the proper path > traversal has been done by a trusted process and there is no more > exposure than unix socket fd passing. If we make the handle depend on the random key, how do we make sure that after a server crash when get the same inode with the same handle. That would be a requirement for an NFS server right? We can definitely encode device details or UUID details the same way nfs server does in fh_compose. But i guess making it random. What are the issues of being able to guess the file handle ? May be there are other ways to handle those. > > > + if ((!(flags & O_APPEND) || (flags & O_TRUNC)) && > > + (flags & FMODE_WRITE) && IS_APPEND(inode)) { > > Please use normal indenting style, aligning on the parenthesis. -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html