On Sat, 2024-02-17 at 15:56 -0500, Kent Overstreet wrote:
> AKA - integer identifiers considered harmful
>
> Any time you've got a namespace that's just integers, if you ever end
> up needing to subdivide it you're going to have a bad time.
>
> This comes up all over the place - for another example, consider
> ioctl numbering, where keeping them organized and collision free is a
> major headache.
>
> For UIDs, we need to be able to subdivide the UID namespace for e.g.
> containers and mounting filesystems as an unprivileged user - but
> since we just have an integer identifier, this requires complicated
> remapping and updating and maintaining a global table.
>
> Subdividing a UID to create new permissions domains should be a
> cheap, easy operation, and it's not.
>
> The solution (originally from plan9, of course) is - UIDs shouldn't
> be numbers, they should be strings; and additionally, the strings
> should be paths.
>
> Then, if 'alice' is a user, 'alice.foo' and 'alice.bar' would be
> subusers, created by alice without any privileged operations or
> mucking with outside system state, and 'alice' would be superuser
> w.r.t. 'alice.foo' and 'alice.bar'.
>
> What's this get us?

I would have to say that changing kuid for a string doesn't really buy
us anything except a load of complexity for no very real gain. 
However, since the current kuid is u32 and exposed uid is u16 and there
is already a proposal to make use of this somewhat in the way you
envision, there might be a possibility to re-express kuid as an array
of u16s without much disruption.  Each adjacent pair could represent
the owner at the top and the userns assigned uid underneath.  That
would neatly solve the nesting problem the current upper 16 bits
proposal has.

However, neither proposal would get us out of the problem of mount
mapping because we'd have to keep the filesystem permission check on
the owning uid unless told otherwise.
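
To make the array idea concrete, here is a rough userspace sketch of
what I mean.  None of this is existing kernel code: struct nested_kuid,
KUID_MAX_DEPTH and the helper names are all made up for illustration,
and the fixed depth limit is just an assumption to keep the structure
small.  The owner relation is simply "is a proper prefix of", which is
the array equivalent of 'alice' being superuser over 'alice.foo'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical nesting limit, chosen only for this sketch. */
#define KUID_MAX_DEPTH 4

/*
 * Hypothetical nested id: ids[0] is the outermost owner and
 * ids[depth - 1] is the innermost, userns assigned, uid.
 */
struct nested_kuid {
	uint16_t ids[KUID_MAX_DEPTH];
	size_t depth;		/* number of valid elements, 1..KUID_MAX_DEPTH */
};

/* The uid exposed to the user is the last element of the array. */
static uint16_t nested_kuid_exposed(const struct nested_kuid *k)
{
	assert(k->depth >= 1 && k->depth <= KUID_MAX_DEPTH);
	return k->ids[k->depth - 1];
}

/*
 * Derive a subordinate id under parent.  Touches no global table;
 * fails only if the (assumed) nesting limit is already reached.
 */
static bool nested_kuid_child(const struct nested_kuid *parent,
			      uint16_t sub, struct nested_kuid *out)
{
	if (parent->depth >= KUID_MAX_DEPTH)
		return false;
	*out = *parent;
	out->ids[out->depth++] = sub;
	return true;
}

/* True if a is a strict ancestor of b, i.e. a's ids are a proper prefix. */
static bool nested_kuid_owns(const struct nested_kuid *a,
			     const struct nested_kuid *b)
{
	if (a->depth >= b->depth)
		return false;
	for (size_t i = 0; i < a->depth; i++)
		if (a->ids[i] != b->ids[i])
			return false;
	return true;
}
```

Note this only models the nesting; it does nothing for the mount
mapping problem, where the filesystem check still has to be made
against the owning uid.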

> Much better, easier to use sandboxing - and maybe we can kill off a
> _whole_ lot of other stuff, too.
>
> Apparmour and selinux are fundamentally just about sandboxing
> programs so they can't own everything owned by the user they're run
> by.
>
> But if we have an easy way to say "exec this program as a subuser of
> the current user..."
>
> Then we can control what that program can access with just our
> existing UNIX permission and acls.
>
> This would be a pretty radical change, and there's a number of things
> to explore - lots of brainstorming to do.
>
>  - How can we do this without breaking absolutely everything?
>    Obviously, any syscalls that communicate in terms of UIDs and GIDs
>    are a problem; can we come up with a compat layer so that most
>    stuff more or less still works?
>
>  - How can we do this a way that's the most orthogonal, that gets us
>    the most bang for our buck? How can we kill off as much security
>    model stupidity as possible? How can we make sandboxing _dead
>    easy_ for new applications?

So all of the above could be covered by a u16 kuid array with the last
element exposed to the user as the uid.  However, there are still
problems even with that approach: the unmapped uid/gid is something
some containers rely on and, as I said above, the mount mapping still
would have to be admin assigned.

James