On Tue, Nov 05, 2019 at 08:00:26AM -0800, Andy Lutomirski wrote:
> On Tue, Nov 5, 2019 at 7:55 AM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> >
> > On Tue, Nov 5, 2019 at 7:29 AM Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> > >
> > > Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
> > > descriptor table from the read() implementation of uffd, which may have
> > > security implications for unprivileged use of the userfaultfd.
> > >
> > > Limit availability of UFFD_FEATURE_EVENT_FORK only for callers that have
> > > CAP_SYS_PTRACE.
> >
> > Thanks. But shouldn't we be doing the capability check at
> > userfaultfd(2) time (when we do the other permission checks), not
> > later, in the API ioctl?
>
> The ioctl seems reasonable to me. In particular, if there is anyone
> who creates a userfaultfd as root and then drop permissions, a later
> ioctl could unexpectedly enable FORK.
>
> This assumes that the code in question is only reachable through
> ioctl() and not write().

write isn't implemented.

Until UFFDIO_API runs, all other implemented syscalls are disabled
(i.e. all other ioctls, poll and read). You can quickly verify all
three blocks by searching for UFFD_STATE_WAIT_API.

UFFDIO_API is the place where the handshake with userland happens:
userland asks for certain features and the kernel implementation of
userfaultfd answers yes or no.

Normally we would only ever return -EINVAL on a request for a feature
that isn't available in the running kernel (equivalent to -ENOSYS if
the syscall is entirely missing on an even older kernel). -EPERM is
more informative, as it tells userland that the feature is actually in
the kernel but requires more permissions.

We could have returned -EINVAL too, but it wouldn't have made a
difference to non-privileged CRIU, and we're not aware of other users
that could benefit from -EINVAL instead of -EPERM.
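To make the -EPERM versus -EINVAL distinction concrete, here is a minimal userland sketch of the handshake. It assumes a kernel that carries the patch under discussion (so requesting UFFD_FEATURE_EVENT_FORK without CAP_SYS_PTRACE fails with EPERM); the enum and the names uffd_classify(), uffd_open() and uffd_handshake() are hypothetical and only serve the illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical verdicts, for illustration only. */
enum uffd_api_verdict {
	UFFD_API_GRANTED,		/* the kernel answered "yes" */
	UFFD_API_NEEDS_PRIVILEGE,	/* EPERM: feature exists, caller lacks permission */
	UFFD_API_UNSUPPORTED,		/* EINVAL: typically, feature not in this kernel */
	UFFD_API_OTHER_ERROR,
};

/* Map the errno of a UFFDIO_API call to one of the verdicts above. */
enum uffd_api_verdict uffd_classify(int err)
{
	switch (err) {
	case 0:
		return UFFD_API_GRANTED;
	case EPERM:
		return UFFD_API_NEEDS_PRIVILEGE;
	case EINVAL:
		return UFFD_API_UNSUPPORTED;
	default:
		return UFFD_API_OTHER_ERROR;
	}
}

/* Create a userfaultfd; returns the fd, or -1 with errno set. */
int uffd_open(void)
{
	return (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
}

/*
 * Run the UFFDIO_API handshake, asking for the given feature bits.
 * Returns 0 on success, or the positive errno value on failure.
 */
int uffd_handshake(int uffd, __u64 features)
{
	struct uffdio_api api = { .api = UFFD_API, .features = features };

	assert(uffd >= 0);	/* the caller must have checked uffd_open() */
	if (ioctl(uffd, UFFDIO_API, &api) == -1)
		return errno;
	return 0;
}
```

A caller would check the return value of uffd_open(), pass UFFD_FEATURE_EVENT_FORK to uffd_handshake(), and feed the result to uffd_classify() to decide whether to go on without fork tracking or to report that more privileges are needed.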
This is the relevant CRIU userland:

	if (ioctl(uffd, UFFDIO_API, &uffdio_api)) {
		pr_perror("Failed to get uffd API");
		goto err;
	}

Unfortunately this is an ABI break, but it is preferable to a clean
removal of the feature, because it at least isn't going to break CRIU
deployments running with the PTRACE privilege. A clean removal, while
not ABI-breaking, would have prevented all CRIU users from running
after a kernel upgrade.

The long term plan is to introduce a UFFD_FEATURE_EVENT_FORK2 feature
flag that uses the ioctl to receive the child uffd. It'll consume more
CPU, but it won't require the PTRACE privilege anymore.

Overall, any suid or SCM_RIGHTS fd-receiving app that isn't checking
the retval of open/socket or whatever fd "installing" syscall is not
robust and is prone to break over time, as more people edit the code
or as any library call internally changes behavior. So if there's any
practical issue caused by this, it should be fixed in userland too for
higher robustness. If you stick your userland to std::fs and std::net,
robustness against issues like this is enforced by the language.

Thanks,
Andrea