On Tue, Mar 7, 2023 at 11:16 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Fri, Mar 03, 2023 at 11:13:51PM +0800, Gao Xiang wrote:
> > Hi Alexander,
> >
> > On 2023/3/3 21:57, Alexander Larsson wrote:
> > > On Mon, Feb 27, 2023 at 10:22 AM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > But I know for the people who are more interested in using composefs
> > > for containers the eventual goal of rootless support is very
> > > important. So, on behalf of them I guess the question is: Is there
> > > ever any chance that something like composefs could work rootlessly?
> > > Or conversely: Is there some way to get rootless support from the
> > > overlay approach? Opinions? Ideas?
> >
> > Honestly, I do want to get a proper answer when Giuseppe asked me
> > the same question. My current view is simply "that question is
> > almost the same for all in-kernel fses with some on-disk format".
>
> As far as I'm concerned filesystems with on-disk format will not be made
> mountable by unprivileged containers. And I don't think I'm alone in
> that view. The idea that ever more parts of the kernel with a massive
> attack surface such as a filesystem need to vouch for the safety in
> the face of every rando having access to
> unshare --mount --user --map-root is a dead end and will just end up
> trapping us in a never-ending cycle of security bugs (Because every
> single bug that's found after making that fs mountable from an
> unprivileged container will be treated as a security bug no matter if
> justified or not. So this is also a good way to ruin your filesystem's
> reputation.).
>
> And honestly, if we set the precedent that it's fine for one filesystem
> with an on-disk format to be able to be mounted by unprivileged
> containers then other filesystems eventually want to do this as well.
>
> At the rate we currently add filesystems that's just a matter of time
> even if none of the existing ones would also want to do it.
> And then we're left arguing that this was just an exception for one
> super special, super safe, unexploitable filesystem with an on-disk
> format.
>
> Imho, none of this is appealing. I don't want to slowly keep building a
> future where we end up running fuzzers in unprivileged containers to
> generate random images to crash the kernel.
>
> I have more arguments why I don't think this is a path we will ever go
> down but I don't want this to detract from the legitimate ask of making
> it possible to mount trusted images from within unprivileged containers.
> Because I think that's perfectly legitimate.
>
> However, I don't think that this is something the kernel needs to solve
> other than providing the necessary infrastructure so that this can be
> solved in userspace.

So, I completely understand this point of view. And, since I'm not
really hearing any other viewpoint from the Linux VFS developers, it
seems to be a shared opinion. So, it seems like further work on the
kernel side of composefs isn't really useful anymore, and I will focus
my work on the overlayfs side. Maybe we can even drop the summit topic
to avoid a bunch of unnecessary travel?

That said, even though I understand (and even agree with) your worries,
I feel it is kind of unfortunate that we end up with (essentially) a
setuid helper approach for this. Because it feels like we're giving up
on a useful feature (trustless unprivileged mounts) that the kernel
could *theoretically* deliver, but a setuid helper can't. Sure, if you
have a closed system you can limit what images can get mounted to
images signed by a trusted key, but it won't work well for things like
user-built images or publicly available images. Unfortunately,
practicalities kinda outweigh theoretical advantages.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                Red Hat, Inc
       alexl@xxxxxxxxxx         alexander.larsson@xxxxxxxxx