On Thu, Oct 29, 2020 at 11:37:23AM -0500, Eric W. Biederman wrote:
> First and foremost: A uid shift on write to a filesystem is a security
> bug waiting to happen.  This is especially true in the context of
> facilities like iouring, that play very aggressive games with how
> process context makes it to system calls.
>
> The only reason containers were not immediately exploitable when iouring
> was introduced is because the mechanisms are built so that even if
> something escapes containment the security properties still apply.
> Changes to the uid when writing to the filesystem do not have that
> property.  The tiniest slip in containment will be a security issue.
>
> This is not even the least bit theoretical.  I have seen reports of how
> shiftfs+overlayfs created a situation where anyone could read
> /etc/shadow.

This bug was the result of a complex interaction with several
contributing factors. It's fair to say that one component was overlayfs
writing through an id-shifted mount, but the primary cause was how
copy-up was done, coupled with allowing unprivileged overlayfs mounts in
a user ns. Checks that the mounter had access to the lower fs file were
not done before copying data up, and so the file was copied up
temporarily to the id-shifted upperdir. Even though it was immediately
removed, other factors made it possible for the user to get the file
contents from the upperdir.

Regardless, I do think you raise a good point. We need to be wary of any
place the kernel could open files through a shifted mount, especially
when the open could be influenced by userspace.

Perhaps kernel file opens through shifted mounts should be opt-in. I.e.
unless a flag is passed, or a different open interface is used, the open
will fail if the dentry being opened is subject to id shifting. This way
any kernel writes which would be subject to id shifting will only happen
through code which has been written to take it into account.

Seth