> With my kernel hat on, maybe I agree. But with my *user* hat on, I
> think I pretty strongly disagree. Look, idmap is lousy for
> unprivileged use:
>
> $ install -m 0700 -d test_directory
> $ echo 'hi there' >test_directory/file
> $ podman run -it --rm
> --mount=type=bind,src=test_directory,dst=/tmp,idmap [debian-slim]

$ podman run -it --rm --mount=type=bind,src=test_directory,dst=/tmp,idmap [debian-slim]

as an unprivileged user doesn't use idmapped mounts at all. So I'm not
sure what this is showing. I suppose you're talking about idmaps in
general.

> # cat /tmp/file
> hi there
>
> <-- Hey, look, this kind of works!
>
> # setpriv --reuid=1 ls /tmp
> ls: cannot open directory '/tmp': Permission denied
>
> <-- Gee, thanks, Linux!
>
>
> Obviously this is a made up example. But it's quite analogous to a
> real example. Suppose I want to make a directory that will contain
> some MySQL data. I don't want to share this directory with anyone
> else, so I set its mode to 0700. Then I want to fire up an
> unprivileged MySQL container, so I build or download it, and then I
> run it and bind my directory to /var/lib/mysql and I run it. I don't
> need to think about UIDs or anything because it's 2024 and containers
> just work. Okay, I need to setenforce 0 because I'm on Fedora and
> SELinux makes absolutely no sense in a container world, but I can live
> with that.
>
> Except that it doesn't work! Because unless I want to manually futz
> with the idmaps to get mysql to have access to the directory inside
> the container, only *root* gets to get in. But I bet that even
> futzing with the idmap doesn't work, because software like mysql often
> expects that root *and* a user can access data. And some software
> even does privilege separation and uses more than one UID.

If the directory is 0700 and it's owned by, say, root:root on the host,
and you want to share it with arbitrary container users, then that isn't
something you can do today even on the host (ignoring group permissions
and ACLs for the sake of your argument). So it's not a limitation of
userns or idmapped mounts. What you're asking for means many-to-one
mappings of uids/gids.

> So I want a way to give *an entire container* access to a directory.
> Classic UNIX DAC is just *wrong* for this use case. Maybe idmaps
> could learn a way to squash multiple ids down to one. Or maybe

Mapping many ids to one is in principle possible, and I noted that idea
down as a possible extension at
https://github.com/uapi-group/kernel-features quite a while (2 years?)
ago.

> I haven't looked at the idmap implementation nearly enough to have any
> opinion as to whether squashing UID is practical or whether there's

It's doable. The interesting bit to me was that if we want to allow
writes, we'd need a way to determine which uid/gid to write down to
disk. Imho, that's not super difficult to solve though. The most obvious
solution is that userspace just determines it when creating the idmapped
mount.
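
To make the squashing idea concrete, here is a minimal toy model in
Python. It is only a sketch of the semantics discussed above, not kernel
code and not an existing interface: the SquashMapping class, its fields,
and the may_access_0700 helper are all hypothetical names. It assumes
the single backing id is picked once, when the mapping is created (the
"userspace determines it when creating the idmapped mount" option), and
it ignores groups, ACLs, and capabilities.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SquashMapping:
    """Hypothetical many-to-one idmapping (not an existing kernel API)."""
    first_id: int    # first caller-visible id covered by the mapping
    count: int       # number of caller-visible ids covered
    backing_id: int  # the one on-disk id, chosen when the mapping is created

    def contains(self, caller_id: int) -> bool:
        return self.first_id <= caller_id < self.first_id + self.count

    def to_disk(self, caller_id: int) -> Optional[int]:
        """Id used for permission checks and as owner of newly created files.

        Returns None for ids outside the mapping (unmapped callers).
        """
        return self.backing_id if self.contains(caller_id) else None


def may_access_0700(mapping: SquashMapping, caller_id: int, dir_owner: int) -> bool:
    """Owner-only (mode 0700) check after squashing the caller's id."""
    return mapping.to_disk(caller_id) == dir_owner


# Every id in the container's range is squashed to host uid 1000.
mapping = SquashMapping(first_id=0, count=65536, backing_id=1000)
root_ok = may_access_0700(mapping, caller_id=0, dir_owner=1000)
uid1_ok = may_access_0700(mapping, caller_id=1, dir_owner=1000)
```

In this model both container root and container uid 1 pass the owner
check on a 0700 directory owned by uid 1000, and anything either of them
creates would be written down as uid 1000. The reverse direction (which
id a caller sees as the owner when it stats a file) is left out here,
since a many-to-one mapping has no unique inverse.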