On Wed, May 3, 2017 at 6:04 PM, David Howells <dhowells@xxxxxxxxxx> wrote: > > Here are a set of patches to create a mount context prior to setting up a > new mount, populating it with the parsed options/binary data and then > effecting the mount. Great work, thanks for taking this on. I'd argue with some design decisions here. One of the motivations for doing the mount API overhaul is to create clear distinction between separate functions like: - creating filesystem instance (aka superblock) - attaching filesystem instance into mount tree - reconfiguring superblock - changing mount properties This patchset achieves this partly, but the separation is far from crisp clear... First of all why is fsopen() creating a "mount context"? It's suppsed to create a "superblock creation context". And indeed, there are mount flags and root path in there, which are definitely not necessary for creating a super block. Is there a good reason why these mount specific properties leaked into the object created by fsopen()? Also I'd expect all context ops to be fully generic first. I.e. no filesystem code needs to be touched to make the new interface work. The context would just build the option string and when everything is ready (probably need a "commit" command) then it would go off and call mount_fs() to create the superblock and attach it to the context. Then, when that works, we could add context ops, so the filesystem can do various things along the way, which is the other reason we want this. And in the end it would allow gradual migration to a new superblock creation api and phasing out the old one. But that shouldn't be observable on either the old or the new userspace interfaces. > This allows namespaces and other information to be conveyed through the > mount procedure. It also allows extra error information to be returned > (so many things can go wrong during a mount that a small integer isn't > really sufficient to convey the issue). > > This also allows Miklós Szeredi's idea of doing: > > fd = fsopen("nfs"); > write(fd, "option=val", ...); > fsmount(fd, "/mnt"); > > that he presented at LSF-2017 to be implemented (see the relevant patches > in the series), to which I can add: > > read(fd, error_buffer, ...); > > to read back any error message. I didn't use netlink as that would make it > depend on CONFIG_NET and would introduce network namespacing issues. > > I've implemented mount context handling for procfs and nfs. > > Further developments: > > (*) Implement mount context support in more filesystems, ext4 being next > on my list. > > (*) Move the walk-from-root stuff that nfs has to generic code so that you > can do something akin to: > > mount /dev/sda1:/foo/bar /mnt > > See nfs_follow_remote_path() and mount_subtree(). This is slightly > tricky in NFS as we have to prevent referral loops. First we can limit this feature to non-weird (ie. no managed dentries) subtrees. > > (*) Move the pid_ns pointer from struct mount_context to struct > proc_mount_context as I'm not sure it's necessary for anything other > than procfs. > > (*) Work out how to get at the error message incurred by submounts > encountered during nfs_follow_remote_path(). > > Should the error message be moved to task_struct and made more > general, perhaps retrieved with a prctl() function? > > (*) Clean up/consolidate the security functions. Possibly add a > validation hook to be called at the same time as the mount context > validate op. > > The patches can be found here also: > > http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=mount-context Will try to review the actual patches next week. Thanks, Miklos