On Tue, 2024-02-20 at 19:25 -0500, Kent Overstreet wrote:
> On Mon, Feb 19, 2024 at 09:26:25AM -0500, James Bottomley wrote:
> > I would have to say that changing kuid for a string doesn't really
> > buy us anything except a load of complexity for no very real gain.
> > However, since the current kuid is u32 and exposed uid is u16 and
> > there is already a proposal to make use of this somewhat in the way
> > you envision,
>
> Got a link to that proposal?

I think this is the latest presentation on it:

https://fosdem.org/2024/schedule/event/fosdem-2024-3217-converting-filesystems-to-support-idmapped-mounts/

> > there might be a possibility to re-express kuid as an array
> > of u16s without much disruption. Each adjacent pair could
> > represent the owner at the top and the userns assigned uid
> > underneath. That would neatly solve the nesting problem the
> > current upper 16 bits proposal has.
>
> At a high level, there's no real difference between a variable length
> integer, or a variable length array of integers, or a string.

Right, so the advantage is the kernel already does an integer
comparison all over the place.

> But there's real advantages to getting rid of the string <-> integer
> identifier mapping and plumbing strings all the way through:
>
>  - creating a new sub-user can be done with nothing more than the new
>    username version of setuid(); IOW, we can start a new named subuser
>    for e.g. firefox without mucking with _any_ system state or tables
>
>  - sharing filesystems between machines is always a pita because
>    usernames might be the same but uids never are - let's kill that
>    off, please
>
> Doing anything as big as an array of integers is going to be a major
> compatibility break anyways, so we might as well do it right.

I'm not really convinced it's right. Strings are trickier to handle
and compare than integer arrays and all of the above can be done by
either.

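To make the comparison point concrete, here is a minimal userspace
sketch of an id held as an array of u16s. Everything in it is an
assumption made for illustration: struct nested_kuid, the depth limit
and the helper names are hypothetical and exist nowhere in the kernel,
and treating "owner" as "array prefix" is only one possible reading of
the adjacent-pair idea above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical nesting limit, chosen only for this sketch. */
#define NESTED_KUID_MAX_DEPTH 4

/*
 * Hypothetical layout, not an existing kernel type: one u16 per
 * nesting level, outermost first.  Each adjacent pair is the owner
 * followed by the uid assigned inside the next userns down.
 */
struct nested_kuid {
	uint8_t depth;                      /* number of levels in use */
	uint16_t id[NESTED_KUID_MAX_DEPTH]; /* id[0] is the host uid */
};

/* Equality is a fixed-width compare: no terminator scan, no encoding. */
static bool nested_kuid_eq(const struct nested_kuid *a,
			   const struct nested_kuid *b)
{
	if (a->depth != b->depth)
		return false;
	return memcmp(a->id, b->id, a->depth * sizeof(a->id[0])) == 0;
}

/* True if 'owner' is 'sub' itself or one of its ancestors (a prefix). */
static bool nested_kuid_owns(const struct nested_kuid *owner,
			     const struct nested_kuid *sub)
{
	if (owner->depth > sub->depth)
		return false;
	return memcmp(owner->id, sub->id,
		      owner->depth * sizeof(owner->id[0])) == 0;
}

/* Derive a sub-id one level below 'parent' without touching any table. */
static struct nested_kuid nested_kuid_child(const struct nested_kuid *parent,
					    uint16_t uid)
{
	struct nested_kuid child = *parent;

	assert(parent->depth < NESTED_KUID_MAX_DEPTH);
	child.id[child.depth++] = uid;
	return child;
}
```

With a host id of { 1000 }, nested_kuid_child(&host, 1) gives
{ 1000, 1 }, which the host id owns but which does not own the host
id. The same derive-a-sub-user operation works without a string
anywhere in the comparison path.
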
> Either way we're going to need a mapping to 16 bit uids for
> compatibility; doing this right gives userspace an incentive to get
> _off_ that compatibility layer so we're not dealing with that
> impedance mismatch forever.

Fundamentally we have a load of integer to pretty name things we use in
the kernel (protocol, port, ...). The point though is the kernel
doesn't need to know the pretty name, it deals with integers and user
space does the conversion.

> > However, neither proposal would get us out of the problem of mount
> > mapping because we'd have to keep the filesystem permission check
> > on the owning uid unless told otherwise.
>
> Not sure I follow?

Mounting a filesystem inside a userns can cause huge security problems
if we map fs root to inner root without the admin blessing it. Think
of binding /bin into the userns and then altering one of the root owned
binaries as inner root: if the permission check passes, the change
appears in system /bin.

> We're always going to need mount mapping, but if the mount mapping is
> just "usernames here get mapped to this subtree of the system
> username namespace", then that potentially simplifies things quite a
> bit - the mount mapping is no longer a _table_.

But what then is it? If you allow the user arbitrarily to assign
subuids, you can't trust them for the mapping to the fs uid. The
current newuidmap/newgidmap are somewhat nasty but at least they're
controlled. I did try a prototype where all we cared about was the
root<->root mapping, but a unix system has other uids that are
privileged as well, so it didn't solve the security problem.

> And it wouldn't have to be administrator assigned. Some administrator
> assignment might be required for the username <-> 16 bit uid mapping,
> but if those mappings are ephemeral (i.e. if we get filesystems
> persistently storing usernames, which is easy enough with xattrs)
> then that just becomes "reserve x range of the 16 bit uid space for
> ephemeral translations".

*if* the user names you're dealing with are all unprivileged. When we
have a mix of privileged and unprivileged users owning the files, the
problems begin.

James