On Sun, May 09, 2021 at 05:58:22PM -0500, Eric W. Biederman wrote:
> Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Sat, May 08, 2021 at 10:46:23PM +0000, Al Viro wrote:
> >> On Sat, May 08, 2021 at 03:17:44PM -0700, Linus Torvalds wrote:
> >> > On Sat, May 8, 2021 at 2:06 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >> > >
> >> > > On Sat, May 08, 2021 at 01:39:45PM -0700, Linus Torvalds wrote:
> >> > >
> >> > > > +static inline int prepend_entries(struct prepend_buffer *b, const struct path *path, const struct path *root, struct mount *mnt)
> >> > >
> >> > > If anything, s/path/dentry/, since vfsmnt here will be equal to &mnt->mnt all along.
> >> >
> >> > Too subtle for me.
> >> >
> >> > And is it? Because mnt is from
> >> >
> >> >      mnt = real_mount(path->mnt);
> >> >
> >> > earlier, while vfsmount is plain "path->mnt".
> >>
> >> static inline struct mount *real_mount(struct vfsmount *mnt)
> >> {
> >>         return container_of(mnt, struct mount, mnt);
> >> }
> >
> > Basically, struct vfsmount instances are always embedded into struct mount ones.
> > All information about the mount tree is in the latter (and is visible only if
> > you manage to include fs/mount.h); here we want to walk towards root, so...
> >
> > Rationale: a lot places use struct vfsmount pointers, but they've no need to
> > access all that stuff.  So struct vfsmount got trimmed down, with most of the
> > things that used to be there migrating into the containing structure.
> >
> > [Christian Browner Cc'd]
> > BTW, WTF do we have struct mount.user_ns and struct vfsmount.mnt_userns?
> > Can they ever be different?  Christian?
>
> I presume you are asking about struct mnt_namespace.user_ns and
> struct vfsmount.mnt_userns.
>
> That must the idmapped mounts work.
>
> In short mnt_namespace.user_ns is the user namespace that owns
> the mount namespace.
>
> vfsmount.mnt_userns functionally could be reduced to just some struct
> uid_gid_map structures hanging off the vfsmount.  It's purpose is

No. The userns can in the future be used for permission checking when
delegating features per mount.

> to add a generic translation of uids and gids on from the filesystem
> view to the what we want to show userspace.
>
> That code could probably benefit from some refactoring so it is clearer,
> and some serious fixes.  I reported it earlier but it looks like there
> is some real breakage in chown if you use idmapped mounts.

You mentioned something about chown already some weeks ago here [1] and
never provided any details or a reproducer for it. This code is
extensively covered by xfstests, and systemd and others are already
using it, so far without any issues reported by users. If there is an
issue, it'd be good to fix it and see the tests changed to cover that
particular case.

[1]: https://lore.kernel.org/lkml/20210213130042.828076-1-christian.brauner@xxxxxxxxxx/T/#m3a9df31aa183e8797c70bc193040adfd601399ad
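
For anyone following the quoted real_mount() exchange, here is a minimal
userspace sketch of the embedding. Everything in it is an assumption
made for illustration: the struct layouts are heavily trimmed stand-ins
(the mnt_id and mnt_flags fields are only there to give the structs some
content), container_of() is a simplified version without the kernel's
type checking, and depth_from_vfsmount() is a hypothetical helper, not a
kernel function. It relies on the convention that a root mount is its
own mnt_parent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-in: the part most callers are allowed to see. */
struct vfsmount {
	int mnt_flags;
};

/* Trimmed stand-in: mount-tree information lives in the container. */
struct mount {
	int mnt_id;			/* illustrative field only */
	struct mount *mnt_parent;	/* a root mount points to itself */
	struct vfsmount mnt;		/* embedded, not a pointer */
};

/* Same shape as the helper quoted above. */
static inline struct mount *real_mount(struct vfsmount *mnt)
{
	return container_of(mnt, struct mount, mnt);
}

/*
 * Hypothetical helper: starting from a bare vfsmount pointer, recover
 * the containing struct mount and count the hops up to the root.
 */
static int depth_from_vfsmount(struct vfsmount *vfsmnt)
{
	struct mount *m;
	int depth = 0;

	assert(vfsmnt != NULL);
	m = real_mount(vfsmnt);
	while (m->mnt_parent != m) {
		m = m->mnt_parent;
		depth++;
	}
	return depth;
}
```

The point of the sketch is only that &real_mount(v)->mnt == v holds for
any vfsmount embedded this way, which is why the vfsmnt and mnt
arguments discussed above carry the same information.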