On Fri, Jun 21, 2013 at 11:14:20AM -0400, Dwight Engen wrote: > On Fri, 21 Jun 2013 08:03:11 +1000 > Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > On Thu, Jun 20, 2013 at 09:54:10AM -0400, Dwight Engen wrote: > > > On Thu, 20 Jun 2013 10:13:41 +1000 > > > Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > > > > > On Wed, Jun 19, 2013 at 11:09:48AM -0400, Dwight Engen wrote: > > > > > Use uint32 from init_user_ns for xfs internal uid/gid > > > > > representation in acl, xfs_icdinode. Conversion of kuid/gid is > > > > > done at the vfs boundary, other user visible xfs specific > > > > > interfaces (bulkstat, eofblocks filter) expect uint32 > > > > > init_user_ns uid/gid values. > > > > > > > > It's minimal, but I'm not sure it's complete. I'll comment on that > > > > in response to Eric's comments... > > ... > > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > > > > > index 7f7be5f..8049976 100644 > > > > > --- a/fs/xfs/xfs_inode.c > > > > > +++ b/fs/xfs/xfs_inode.c > > > > > @@ -1268,8 +1268,8 @@ xfs_ialloc( > > > > > ip->i_d.di_onlink = 0; > > > > > ip->i_d.di_nlink = nlink; > > > > > ASSERT(ip->i_d.di_nlink == nlink); > > > > > - ip->i_d.di_uid = current_fsuid(); > > > > > - ip->i_d.di_gid = current_fsgid(); > > > > > + ip->i_d.di_uid = from_kuid(&init_user_ns, > > > > > current_fsuid()); > > > > > + ip->i_d.di_gid = from_kgid(&init_user_ns, > > > > > current_fsgid()); > > > > > > > > Why all new inodes be created in the init_user_ns? Shouldn't this > > > > be current_user_ns()? > > > > > > current_fsuid() is the kuid_t from whatever userns current is in, > > > > Yes. > > > > > which we are converting to a flat uint32 since i_d is the on disk > > > inode. > > I was incorrect here. current_fsuid() is the kuid_t from the > perspective of init_user_ns, so the value won't be different, just the > type. I guess I'm not the only that is easily confused by this convoluted, badly documented namespace conversion crap either.... > > So, init_user_ns is actually the target namespace of the conversion, > > not the namespace we are converting *from*? > > > > <sigh> > > > > I *knew* this was going to happen when I first saw this code: > > > > http://www.spinics.net/lists/netdev/msg209217.html > > > > "For those of us that have to look at it once every few months, > > following the same conventions as all the other code in the kernel > > (i.e. kqid_to_id()) tells me everything I need to know without > > having to go through the process of looking up the unusual > > from_kqid() function and then from_kuid() to find out what it is > > actually doing...." > > > > Here I am, several months later and trying to work out what the damn > > conversion from_kgid() and make_kuid() are supposed to be doing and > > trying to work out if it is correct or not... > > > > > This field is then used in xfs_setup_inode() to populate > > > VFS_I(ip)->i_uid. > > > > But that then uses make_kuid(&init_user_ns, ip->i_d.di_uid) which > > according to the documentation creates a kuid in the init_user_ns, > > not in the current user namespace. > > Right, inside the kernel uids are all interpreted within > init_user_ns context and are only converted to a different value if the > caller is in some other userns at the kernel/user boundary (ie. > cp_compat_stat()). Maybe it is more clear to say that init_user_ns is > the "flat 32bit uid space" and thus has a mapping for every uid. So > the value in init_user_ns will actually be equal to the di_uid value. If that's the case, then why the hell do filesystems need to know *anything* about namespaces and conversions? It shoul dbe *completely* hidden from filesystem code with simple wrappers rather than open coding init_user_ns conversions everywhere.... Indeed, for updating or getting the uid/gid from the struct inode there are these helpers: i_uid_read/i_uid_write i_gid_read/i_uid_write but there's nothing generic for the filesystems to use that just take a kuid_t/kgid_t and return a raw value or vice versa. Nice. > > So if we then do uid_eq(current_fsuid(), VFS_I(ip)->i_uid) on the > > newly created and initialised inode they are different, yes? If they > > are different, then this code is not correct.... > > No they should equal because xfs_setup_inode() maps it back, see below. > > > > Most other filesystems would use inode_init_owner(), > > > but xfs does not (I assume because it wants to handle SGID itself > > > based on XFS_INHERIT_GID and irix_sgid_inherit). > > > > No, XFS doesn't use inode_init_owner() because we are initialising > > the on disk XFS inode here, not the VFS struct inode... > > Right, but I was referring to xfs_setup_inode() which is setting up the > VFS inode. It is called in both the from disk and new inode paths, and > sets the VFS inode uid based on the value in the disk inode. So perhaps your comment was in the wrong place? ;) > I was > trying to show that in the new inode path the VFS inode uid is being > initialized the same as on other filesystems (ie from current_fsuid()), > but with an extra step just because of "conversion" to and back from > init_user_ns. The call sequence in xfs: > > xfs_ialloc(): > current_fsuid() -> from_kuid() -> di_uid ip->i_d.di_uid = xfs_kuid_to_disk(current_fsuid()); > xfs_setup_inode(): > di_uid -> make_kuid() -> inode.i_uid == current_fsuid() inode->i_uid = xfs_kuid_from_disk(ip->i_d.di_uid); > Sorry, I think my first explanation wasn't clear, hopefully this is. Well, it's taken me another hour of battling through the undocumented crap that is this namespace code starting from setuid(), but I understand the conversions being done now. I dislike the implementation even more now, and I'm still wondering what the hell we are supposed to do with bulkstat, filehandles and stuff like xfs_fsr, backups, etc... > > So, to prevent me from wondering what it is doing in another 6 > > months time, can you add a set of helper functions that are named: > > > > xfs_kuid_to_disk() > > xfs_kuid_from_disk() > > xfs_kgid_to_disk() > > xfs_kgid_from_disk() > > > > and document why we are using the namespaces that are being used, > > and then use them where we convert to/from the different inode > > structures? > > Sure I can take a shot at that, and I'm guessing you would prefer to use > > inode->i_uid = xfs_kuid_from_disk(ip->i_d.di_uid); > > over > > i_uid_write(inode, ip->i_d.di_uid); Yes, because i_uid_write() is not a generic helper function, and we convert to/from disk format in many places where we aren't actually reading/writing the struct inode. i.e. I don't see any reason at all for having the variable init_user_ns anywhere in the XFS code except than in those wrapper functions.... > > FWIW, what happens when ip->i_d.di_gid doesn't have a mapping in the > > current namespace, and make_kuid/make_kgid return > > INVALID_UID/INVALID_GID? Is this is going to happen, and if it does > > what do we need to do about it? That will need to be added to the > > comments, too. > > I don't think it will happen here because init_user_ns has a mapping > for all values. Where it can happen is when there is a conversion to > some subset userns, that doesn't have a mapping for the value. Yup, understood. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs