On Tue, 1 Sep 2020 at 16:53, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> "Serge E. Hallyn" <serge@xxxxxxxxxx> writes:
>
> > On Fri, Aug 28, 2020 at 10:17:16AM -0500, Eric W. Biederman wrote:
> >>
> >> We had a discussion in the hackroom at LPC talking about use cases for
> >> a shiftfs style setup where there are different mappings of uids to
> >> disk.
> >>
> >> In the discussion we had a couple of ideas of kernel developments
> >> we should look at that address some of these.
> >>
> >> - Fix rlimits in user namespaces (This potentially allows multiple
> >>   containers to run with the same userids simplifying the mapping
> >>   problem).
> >>
> >> - Look at extending kuid_t to 64bits and using the highbits to
> >>   implement uids that are private to user namespaces and don't
> >>   map out.
> >>
> >> - Look at ways for allowing setgroups unprivileged.
> >>
> >>
> >> Together this has the potential that the existing uid & gid mappings
> >> will be able to function the same as the proposed fusid mappings.
> >> Fingers crossed.
> >>
> >>
> >> I had some problems with audio and a lot of people were talking
> >> quickly. So I did not manage to capture everyone's use cases. And I
> >> definitely was not able to see how everyone's use cases interacted with
> >> the changes we are looking at.
> >>
> >> I know for certain I missed Serge's usecase (apologies).
> >>
> >> Can people follow up to this and report their use cases?
> >
> > Sorry - I'll do so later this week.
>
> Thank you.
>
> I know we have the OCI use case of overlayfs and sharing storage
> between containers.

Is there a description of this use case, and ideas of possible
solutions?

There was a use case that was brought to a recent Kubernetes sig-node
meeting, but I am not sure it is the same one, so I'll describe this
one.

We're working to enable user namespaces in Kubernetes with two
possible setups:

1. Each Kubernetes pod has its own userns but with the same user id
   mapping (to make it easy to manage shared volumes)
2. Each Kubernetes pod has its own userns with non-overlapping user id
   mapping (providing additional isolation between pods)

Pods could be executed with a Kubernetes sidecar such as Envoy,
meaning that if there are 'n' pods running on a node, there will be
'n' containers running the Envoy container image.

In the second setup, each Envoy container will have a different user
id mapping. The Envoy container image will be snapshotted in as many
copies, and each copy will be chown'ed appropriately for each
non-overlapping user id mapping. Each copy will then be used as the
lowerdir for the overlayfs rootfs of the container. This results in a
waste of disk space.

Without user namespaces, we don't have this problem, since the same
directory can be used as lowerdir for each container's overlayfs
rootfs.

This is not specific to sidecars, but they exacerbate the problem.

I was not present in the Plumbers discussions, so I don't know if
possible solutions were discussed. For example: could overlayfs get a
mount option to specify an id mapping to apply on the lowerdir?

In the case of containerd + runc, the mount option would probably need
to be an id mapping rather than a reference to an existing user
namespace, because the overlayfs mount is set up by containerd before
runc is executed, so before the user namespace is unshared.

Alban

> I know we have the lxc case of not wanting to be strangled by ulimits.
> So not using the same uid between containers even when it is logically
> the same users.
>
> I know the brainstorming was going a lot of different directions and I
> piped up and said that we should probably focus on handling the stranger
> cases with fuse mounts, and the other capabilities we have now.
>
> It really will be valuable to understand the other cases so we don't
> code ourselves into a corner that only works for the most vocal of the
> developers.
>
> Eric
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers