On Sun, Nov 12, 2023 at 3:14 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Fri, Nov 10, 2023 at 12:06:14PM -0500, Charles Mirabile wrote:
> > This is a one line change that makes `linkat` aware of namespaces when
> > checking for capabilities.
> >
> > As far as I can tell, the call to `capable` in this code dates back to
> > before the `ns_capable` function existed, so I don't think the author
> > specifically intended to prefer regular `capable` over `ns_capable`,
> > and no one has noticed or cared to change it yet... until now!
> >
> > It is already hard enough to use `linkat` to link temporary files
> > into the filesystem without the `/proc` workaround, and when moving
> > a program that was working fine on bare metal into a container,
> > I got hung up on this additional snag due to the lack of namespace
> > awareness in `linkat`.
>
> I agree that it would be nice to relax this a bit to make this play
> nicer with containers.
>
> The current checks want to restrict scenarios where an application is
> able to create a hardlink for an arbitrary file descriptor it has
> received via e.g., SCM_RIGHTS or that it has inherited.

Makes sense.

>
> So we want to somehow get a good enough approximation to the question
> whether the caller would have been able to open the source file.
>
> When we check for CAP_DAC_READ_SEARCH in the caller's namespace we
> presuppose that the file is accessible in the current namespace and that
> CAP_DAC_READ_SEARCH would have been enough to open it. Both aren't
> necessarily true. Neither need the file be accessible, e.g., due to a
> chroot or pivot_root nor need CAP_DAC_READ_SEARCH be enough. For
> example, the file could be accessible in the caller's namespace but due
> to uid/gid mapping the {g,u}id of the file doesn't have a mapping in the
> caller's namespace. So that doesn't really cut it imho.

Good point.

>
> However, if we check for CAP_DAC_READ_SEARCH in the namespace the file
> was opened in that could work.
> We know that the file must've been
> accessible in the namespace the file was opened in and we
> know that the {g,u}id of the file must have been mapped in the namespace
> the file was opened in. So if we check that the caller does have
> CAP_DAC_READ_SEARCH in the opener's namespace we can approximate that
> the caller could've opened the file.

Would that be the namespace pointed to by `->f_cred->user_ns` on the
struct file corresponding to the fd? If so, is there a better way to
surface that struct file for checking than this?

	error = -ENOENT;
	if (flags & AT_EMPTY_PATH && !old->name[0]) {
		struct file *file = fget(oldfd);
		bool allowed;

		/* fget() returns NULL if oldfd is not a usable descriptor */
		if (!file) {
			error = -EBADF;
			goto out_putnames;
		}
		/* check the capability in the opener's user namespace */
		allowed = ns_capable(file->f_cred->user_ns, CAP_DAC_READ_SEARCH);
		fput(file);
		if (!allowed)
			goto out_putnames;
	}

>
> So that should allow us to enable this for containers.
>

Best - Charlie
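
For readers following along, here is a minimal userspace sketch of the two ways to give an unnamed temporary file a name that this thread refers to: the direct `linkat` call with `AT_EMPTY_PATH`, which is the path gated by the CAP_DAC_READ_SEARCH check, and the `/proc` workaround. The helper name `publish_tmpfile` and its `dest` parameter are made up for illustration, and the fallback assumes `/proc` is mounted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: link the file behind `fd` (for example one opened
 * with O_TMPFILE) at `dest`. Returns 0 on success, -1 with errno set.
 */
static int publish_tmpfile(int fd, const char *dest)
{
	char proc_path[64];

	assert(fd >= 0 && dest != NULL);

	/* Direct form: this is the call subject to the capability check. */
	if (linkat(fd, "", AT_FDCWD, dest, AT_EMPTY_PATH) == 0)
		return 0;

	/*
	 * Workaround: refer to the descriptor through /proc and ask linkat
	 * to follow that symlink. Assumes /proc is mounted.
	 */
	snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
	return linkat(AT_FDCWD, proc_path, AT_FDCWD, dest, AT_SYMLINK_FOLLOW);
}
```

A caller would typically obtain `fd` from `open(dir, O_TMPFILE | O_WRONLY, 0600)`, write the contents, and then call `publish_tmpfile(fd, "some/final/name")`. Inside a container without `/proc` mounted only the first call is available, which is where the missing namespace awareness hurts.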