So my essential point is that building the real kuid into the permanent
record of the xattr damages image portability, which is touted as one
of the real advantages of container images.
'Container images' aren't portable in that sense now, at least in many
cases, because you have to shift the uid. However you're doing that,
you may be able to shift the xattr the same way.
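
For readers following along, "shifting the uid" can be sketched roughly
as below. This is only an illustration, not how any particular runtime
does it: the IdMap type, the shift_id helper and the 100000 base are
hypothetical names and values, with the ranges laid out in the same
(inside, outside, count) order as the lines of /proc/<pid>/uid_map.

```python
from typing import NamedTuple, Sequence


class IdMap(NamedTuple):
    """One mapping range, in the same order as a uid_map line (hypothetical type)."""
    inside: int   # first ID as seen inside the user namespace
    outside: int  # first ID as seen outside it (what ends up on disk)
    count: int    # number of consecutive IDs covered by this range


def shift_id(inside_id: int, maps: Sequence[IdMap]) -> int:
    """Translate an ID from the image (inside view) to the on-disk ID."""
    for m in maps:
        if m.inside <= inside_id < m.inside + m.count:
            return m.outside + (inside_id - m.inside)
    # An ID with no mapping cannot be represented in this namespace.
    raise ValueError(f"id {inside_id} is not mapped")


# Assumed example mapping: container IDs 0..65535 live at 100000..165535.
EXAMPLE_MAPS = [IdMap(inside=0, outside=100000, count=65536)]

if __name__ == "__main__":
    for uid in (0, 1000):
        print(uid, "->", shift_id(uid, EXAMPLE_MAPS))
```

The same translation would have to be applied to any uid stored inside
an xattr value, which is the point being argued here.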
Piling more things on top of that issue isn't going to make it easier to
solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
arbitrary xattrs? Plus I would think that handling xattrs would be harder than
{u,g}ids because the image unpacker now has to be aware of all xattrs that
require remapping (which might be an ever-growing list).
The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
in inodes is a problem that we really shouldn't be adding to IMO [especially
given how hard it's been so far to solve that problem.]
There is one very simple solution to the problem.
Perform the unpacking in your user namespace.
I'm not aware of any major container runtime that couples image
unpacking to the runtime components. Docker hasn't done it for years
(it's split between runc and Docker/containerd). rkt hasn't ever done it
(runtime stages are totally separate from image unpacking). cri-o doesn't
do it either. I believe that only singularity does something like that
(though singularity is also not actually a "container runtime" in the
modern meaning of the term).
Not to mention that the OCI standards explicitly separate the two
concepts, and there exist tools to manipulate images that don't
explicitly use containers (or namespaces for that matter) either[1].
The reason Docker doesn't do that is that they want to share files and
images between different containers. That sharing is a challenge when we
are talking about different privilege domains and persistent storage.
Hopefully shiftfs can solve that challenge.
Yes, I'm aware of that -- though claiming it's purely a Docker problem
isn't really fair (it's a problem of any container runtime that wants to
effectively use overlay filesystems). If shiftfs is going to solve the
sharing problem for xattrs as well, then I don't have any complaints
other than "it sucks that we have to add more magical translation to a
still-not-merged shiftfs".
But if you say there's not a nicer way to handle this problem, then
that's good enough for me. :D
[1]: https://github.com/openSUSE/umoci
--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/
_______________________________________________
Containers mailing list
https://lists.linuxfoundation.org/mailman/listinfo/containers