On Tue, Feb 18, 2020 at 03:50:56PM -0800, James Bottomley wrote:
> On Tue, 2020-02-18 at 15:33 +0100, Christian Brauner wrote:
> > In the usual case of running an unprivileged container we will have
> > setup an id mapping, e.g. 0 100000 100000. The on-disk mapping will
> > correspond to this id mapping, i.e. all files which we want to appear
> > as 0:0 inside the user namespace will be chowned to 100000:100000 on
> > the host. This works, because whenever the kernel needs to do a
> > filesystem access it will lookup the corresponding uid and gid in the
> > idmapping tables of the container. Now think about the case where we
> > want to have an id mapping of 0 100000 100000 but an on-disk mapping
> > of 0 300000 100000 which is needed to e.g. share a single on-disk
> > mapping with multiple containers that all have different id mappings.
> > This will be problematic. Whenever a filesystem access is requested,
> > the kernel will now try to lookup a mapping for 300000 in the id
> > mapping tables of the user namespace but since there is none the
> > files will appear to be owned by the overflow id, i.e. usually
> > 65534:65534 or nobody:nogroup.
> >
> > With fsid mappings we can solve this by writing an id mapping of 0
> > 100000 100000 and an fsid mapping of 0 300000 100000. On filesystem
> > access the kernel will now lookup the mapping for 300000 in the fsid
> > mapping tables of the user namespace. And since such a mapping
> > exists, the corresponding files will have correct ownership.
>
> So I did compile this up in order to run the shiftfs tests over it to
> see how it coped with the various corner cases. However, what I find
> is it simply fails the fsid reverse mapping in the setup. Trying to
> use a simple uid of 0 100000 1000 and a fsid of 100000 0 1000 fails the
> entry setuid(0) call because of this code:

This is easy to fix. But what's the exact use-case?
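
For readers following along, the lookup described in the quoted text can be
modelled roughly as below. This is only a simplified Python sketch of the
idea, not the kernel code: the names Extent, host_to_ns_id and OVERFLOW_ID
are made up for illustration, and the real kernel distinguishes an "unmapped"
result from the overflow id, which this sketch collapses into one step. Each
mapping is written as in the examples above: first id inside the namespace,
first id on the host, length of the range.

```python
from typing import List, NamedTuple

# Assumption: the usual default overflow uid/gid (nobody:nogroup).
OVERFLOW_ID = 65534


class Extent(NamedTuple):
    """One mapping line, e.g. "0 100000 100000" (hypothetical helper type)."""
    first: int        # first id as seen inside the user namespace
    lower_first: int  # first id as seen on the host
    count: int        # number of consecutive ids covered


def host_to_ns_id(extents: List[Extent], host_id: int) -> int:
    """Translate a host (on-disk) id into the id seen inside the namespace.

    Returns OVERFLOW_ID when no extent covers host_id.
    """
    for e in extents:
        if e.lower_first <= host_id < e.lower_first + e.count:
            return e.first + (host_id - e.lower_first)
    return OVERFLOW_ID


# id mapping of 0 100000 100000, fsid mapping of 0 300000 100000
id_map = [Extent(0, 100000, 100000)]
fsid_map = [Extent(0, 300000, 100000)]

on_disk_owner = 300000  # file chowned to 300000 on the host

# Without fsid mappings the filesystem lookup goes through the id mapping,
# which has no entry for 300000, so the file shows up as the overflow id.
print(host_to_ns_id(id_map, on_disk_owner))    # 65534

# With fsid mappings the filesystem lookup uses the fsid table instead,
# so the same file appears as owned by 0 inside the namespace.
print(host_to_ns_id(fsid_map, on_disk_owner))  # 0
```

The point of the sketch is only that the two tables are consulted for
different purposes: the id mapping for everything that is not a filesystem
access, the fsid mapping for filesystem accesses.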