On Wed, Feb 19, 2020 at 09:18:51AM -0800, Andy Lutomirski wrote:
> On Wed, Feb 19, 2020 at 4:06 AM Christian Brauner
> <christian.brauner@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 18, 2020 at 08:42:33PM -0600, Serge Hallyn wrote:
> > > On Tue, Feb 18, 2020 at 03:33:55PM +0100, Christian Brauner wrote:
> > > > Introduce a helper which makes it possible to detect filesystems whose
> > > > superblock is visible in multiple user namespaces. This currently only
> > > > means proc and sys. Such filesystems usually have special semantics so their
> > > > behavior will not be changed with the introduction of fsid mappings.
> > >
> > > Hi,
> > >
> > > I'm afraid I've got a bit of a hangup about the terminology here. I
> > > *think* what you mean is that SB_I_USERNS_VISIBLE is an fs whose uids are
> > > always translated per the id mappings, not fsid mappings. But when I see
> >
> > Correct!
> >
> > > the name it seems to imply that !SB_I_USERNS_VISIBLE filesystems can't
> > > be seen by other namespaces at all.
> > >
> > > Am I right in my first interpretation? If so, can we talk about the
> > > naming?
> >
> > Yep, your first interpretation is right. What about: wants_idmaps()
>
> Maybe fsidmap_exempt()?

Yeah, and maybe SB_USERNS_FSID_EXEMPT?

> I still haven't convinced myself that any of the above is actually
> correct behavior, especially when people do things like creating
> setuid binaries.

The only place that would be a problem is if the child userns has an
fsidmapping from X to 0 in the parent userns, right? Yeah, I'm sure many
people would ignore all advice to the contrary and do this anyway, but I
would try hard to suggest that people use an intermediary userns for
storing filesystems for the "docker share" case. So the host fsid range
would start at, say, 200000. So a setuid binary would just be setuid-200000.
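
To make the setuid case above concrete, here is a rough userspace sketch of
the arithmetic. It is a toy model and not kernel code: the Extent type, the
two example maps and the helper names are all made up for illustration, and
the only thing borrowed from reality is the "inside-id outside-id count"
extent layout that uid_map uses. It assumes an fsid mapping would translate
ids the same way.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    """One mapping extent, laid out like a uid_map line (hypothetical type)."""
    inside_first: int   # first id as seen inside the child userns
    outside_first: int  # first id as seen in the parent userns
    count: int          # number of consecutive ids covered


def to_parent(extents: Sequence[Extent], inside_id: int) -> Optional[int]:
    """Translate an id from the child userns into the parent userns."""
    for e in extents:
        if e.inside_first <= inside_id < e.inside_first + e.count:
            return e.outside_first + (inside_id - e.inside_first)
    return None  # id is not mapped


def to_child(extents: Sequence[Extent], outside_id: int) -> Optional[int]:
    """Translate an id from the parent userns into the child userns."""
    for e in extents:
        if e.outside_first <= outside_id < e.outside_first + e.count:
            return e.inside_first + (outside_id - e.outside_first)
    return None  # id is not mapped


# Assumed example: fsids start at 200000 in the parent, as suggested above.
SAFE_FSID_MAP = [Extent(inside_first=0, outside_first=200000, count=65536)]

# Assumed example of the problem case: some id X (here 1000) maps to parent id 0.
RISKY_FSID_MAP = [Extent(inside_first=1000, outside_first=0, count=1)]


def setuid_owner_in_parent(fsid_map: Sequence[Extent],
                           creator_fsid: int) -> Optional[int]:
    """Owner, as seen by the parent, of a setuid binary created in the child."""
    return to_parent(fsid_map, creator_fsid)


if __name__ == "__main__":
    print(setuid_owner_in_parent(SAFE_FSID_MAP, 0))
    print(setuid_owner_in_parent(RISKY_FSID_MAP, 1000))
```

With the first map, a setuid binary created by fsid 0 in the child is owned
by 200000 in the parent, so it is only setuid-200000 there. With the second
map, the binary is owned by 0 in the parent, which is the case to avoid.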