On Fri, Nov 10, 2023 at 12:06:14PM -0500, Charles Mirabile wrote:
> This is a one line change that makes `linkat` aware of namespaces when
> checking for capabilities.
>
> As far as I can tell, the call to `capable` in this code dates back to
> before the `ns_capable` function existed, so I don't think the author
> specifically intended to prefer regular `capable` over `ns_capable`,
> and no one has noticed or cared to change it yet... until now!
>
> It is already hard enough to use `linkat` to link temporarily files
> into the filesystem without the `/proc` workaround, and when moving
> a program that was working fine on bare metal into a container,
> I got hung up on this additional snag due to the lack of namespace
> awareness in `linkat`.

I agree that it would be nice to relax this a bit to make this play
nicer with containers.

The current checks want to restrict scenarios where an application is
able to create a hardlink for an arbitrary file descriptor it has
received via e.g., SCM_RIGHTS or that it has inherited. So we want to
somehow get a good enough approximation to the question whether the
caller would have been able to open the source file.

When we check for CAP_DAC_READ_SEARCH in the caller's namespace we
presuppose that the file is accessible in the current namespace and that
CAP_DAC_READ_SEARCH would have been enough to open it. Both aren't
necessarily true. Neither need the file be accessible, e.g., due to a
chroot or pivot_root nor need CAP_DAC_READ_SEARCH be enough. For
example, the file could be accessible in the caller's namespace but due
to uid/gid mapping the {g,u}id of the file doesn't have a mapping in the
caller's namespace.

So that doesn't really cut it imho. However, if we check for
CAP_DAC_READ_SEARCH in the namespace the file was opened in that could
work.
We know that the file must've been accessible in the namespace the file
was opened in and we know that the {g,u}id of the file must have been
mapped in the namespace the file was opened in. So if we check that the
caller does have CAP_DAC_READ_SEARCH in the opener's namespace we can
approximate that the caller could've opened the file.

So that should allow us to enable this for containers.
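
To make the difference between the three candidate checks concrete, here
is a toy userspace model in C. It is only a sketch under stated
assumptions: every `toy_*` type and function is a hypothetical stand-in
invented for illustration, not the kernel's `struct user_namespace`,
`struct cred`, or `ns_capable()`, and the ancestor walk is a simplified
approximation of how a capability check against a user namespace
behaves. It is not a patch and says nothing about where such a check
would live in the VFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as CAP_DAC_READ_SEARCH in <linux/capability.h>. */
#define TOY_CAP_DAC_READ_SEARCH 2

/* Hypothetical stand-in for a user namespace. */
struct toy_userns {
    const struct toy_userns *parent; /* NULL for the initial namespace */
    int level;                       /* 0 for the initial namespace */
    unsigned int owner_uid;          /* uid in the parent that created it */
};

/* Hypothetical stand-in for a task's credentials. */
struct toy_cred {
    const struct toy_userns *user_ns;
    unsigned int euid;
    uint64_t cap_effective;          /* bitmask of effective capabilities */
};

/* Hypothetical stand-in for an open file; records the opener's namespace. */
struct toy_file {
    const struct toy_userns *opener_ns;
};

/*
 * Does `cred` hold `cap` over `target`? Walk from the target namespace
 * towards the root: a match on the caller's own namespace consults the
 * effective set, the owner of a direct child namespace is granted the
 * capability, and anything at or above the caller's level is denied.
 */
bool toy_ns_capable(const struct toy_cred *cred,
                    const struct toy_userns *target, int cap)
{
    assert(cred != NULL && target != NULL);
    for (const struct toy_userns *ns = target; ns != NULL; ns = ns->parent) {
        if (ns == cred->user_ns)
            return ((cred->cap_effective >> cap) & 1) != 0;
        if (ns->level <= cred->user_ns->level)
            return false;
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return true;
    }
    return false;
}

/* Current behaviour as described: check against the initial namespace. */
bool toy_check_init_ns(const struct toy_cred *cred,
                       const struct toy_userns *init_ns)
{
    return toy_ns_capable(cred, init_ns, TOY_CAP_DAC_READ_SEARCH);
}

/* Proposed one-line change: check against the caller's own namespace. */
bool toy_check_caller_ns(const struct toy_cred *cred)
{
    return toy_ns_capable(cred, cred->user_ns, TOY_CAP_DAC_READ_SEARCH);
}

/* Alternative discussed above: check against the opener's namespace. */
bool toy_check_opener_ns(const struct toy_cred *cred,
                         const struct toy_file *file)
{
    return toy_ns_capable(cred, file->opener_ns, TOY_CAP_DAC_READ_SEARCH);
}
```

In this model a container root task fails the initial-namespace check
for every file, passes the caller-namespace check for every file
(including one opened on the host and passed in via SCM_RIGHTS), and
passes the opener-namespace check only for files that were opened inside
its own namespace or a descendant of it.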