On Tue, Mar 12, 2019 at 09:01:47AM +0200, Mike Rapoport wrote:
> Hi Peter,
>
> On Mon, Mar 11, 2019 at 05:36:58PM +0800, Peter Xu wrote:
> > Hi,
> >
> > (The idea comes from Andrea, and following discussions with Mike and
> > other people)
> >
> > This patchset introduces a new sysctl flag to allow the admin to
> > forbid users from using userfaultfd:
> >
> >   $ cat /proc/sys/vm/unprivileged_userfaultfd
> >   [disabled] enabled kvm
> >
> >   - When set to "disabled", all unprivileged users are forbidden to
> >     use userfaultfd syscalls.
> >
> >   - When set to "enabled", all users are allowed to use userfaultfd
> >     syscalls.
> >
> >   - When set to "kvm", all unprivileged users are forbidden to use the
> >     userfaultfd syscalls, except the user who has permission to open
> >     /dev/kvm.
> >
> > This new flag can add one more layer of security to reduce the attack
> > surface of the kernel by abusing userfaultfd.  Here we grant the
> > thread userfaultfd permission by checking against CAP_SYS_PTRACE
> > capability.  By default, the value is "disabled" which is the most
> > strict policy.  Distributions can have their own preferred value.
> >
> > The "kvm" entry is a bit special here only to make sure that existing
> > users like QEMU/KVM won't break by this newly introduced flag.  What
> > we need to do is simply set the "unprivileged_userfaultfd" flag to
> > "kvm" here to automatically grant userfaultfd permission for processes
> > like QEMU/KVM without extra code to tweak these flags in the admin
> > code.
> >
> > Patch 1:  The interface patch to introduce the flag
> >
> > Patch 2:  The KVM related changes to detect opening of /dev/kvm
> >
> > Patch 3:  Apply the flag to userfaultfd syscalls
>
> I'd appreciate to see "Patch 4: documentation update" ;-)
> It'd be also great to update the man pages after this is merged.

Oops, sorry!  I should have remembered that.

>
> Except for the comment to patch 1, feel free to add
>
> Reviewed-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>

Thanks Mike!

I'll take it for 2/3 until I get confirmation from you on patch 1.

Regards,

--
Peter Xu