On Thu, Jul 23, 2020 at 05:13:28PM -0700, Nick Kralevich wrote:
> On Thu, Jul 23, 2020 at 10:30 AM Lokesh Gidra <lokeshgidra@xxxxxxxxxx> wrote:
> > From the discussion so far it seems that there is a consensus that
> > patch 1/2 in this series should be upstreamed in any case. Is there
> > anything that is pending on that patch?
>
> That's my reading of this thread too.
>
> > > > Unless I'm mistaken that you can already enforce bit 1 of the second
> > > > parameter of the userfaultfd syscall to be set with seccomp-bpf, this
> > > > would be more a question to the Android userland team.
> > > >
> > > > The question would be: does it ever happen that a seccomp filter isn't
> > > > already applied to unprivileged software running without
> > > > SYS_CAP_PTRACE capability?
> > >
> > > Yes.
> > >
> > > Android uses selinux as our primary sandboxing mechanism. We do use
> > > seccomp on a few processes, but we have found that it has a
> > > surprisingly high performance cost [1] on arm64 devices so turning it
> > > on system wide is not a good option.
> > >
> > > [1] https://lore.kernel.org/linux-security-module/202006011116.3F7109A@keescook/T/#m82ace19539ac595682affabdf652c0ffa5d27dad
>
> As Jeff mentioned, seccomp is used strategically on Android, but is
> not applied to all processes. It's too expensive and impractical when
> simpler implementations (such as this sysctl) can exist. It's also
> significantly simpler to test a sysctl value for correctness as
> opposed to a seccomp filter.

Given that selinux is already used system-wide on Android, what is
wrong with using selinux to control userfaultfd as opposed to seccomp?

> > > >
> > > >
> > > > If answer is "no" the behavior of the new sysctl in patch 2/2 (in
> > > > subject) should be enforceable with minor changes to the BPF
> > > > assembly. Otherwise it'd require more changes.
>
> It would be good to understand what these changes are.
>
> > > > Why exactly is it preferable to enlarge the surface of attack of the
> > > > kernel and take the risk there is a real bug in userfaultfd code (not
> > > > just a facilitation of exploiting some other kernel bug) that leads to
> > > > a privilege escalation, when you still break 99% of userfaultfd users,
> > > > if you set with option "2"?
>
> I can see your point if you think about the feature as a whole.
> However, distributions (such as Android) have specialized knowledge of
> their security environments, and may not want to support the typical
> usages of userfaultfd. For such distributions, providing a mechanism
> to prevent userfaultfd from being useful as an exploit primitive,
> while still allowing the very limited use of userfaultfd for userspace
> faults only, is desirable. Distributions shouldn't be forced into
> supporting 100% of the use cases envisioned by userfaultfd when their
> needs may be more specialized, and this sysctl knob empowers
> distributions to make this choice for themselves.
>
> > > > Is the system owner really going to purely run on his systems CRIU
> > > > postcopy live migration (which already runs with CAP_SYS_PTRACE) and
> > > > nothing else that could break?
>
> This is a great example of a capability which a distribution may not
> want to support, due to distribution specific security policies.
>
> > > >
> > > > Option "2" to me looks with a single possible user, and incidentally
> > > > this single user can already enforce model "2" by only tweaking its
> > > > seccomp-bpf filters without applying 2/2. It'd be a bug if android
> > > > apps runs unprotected by seccomp regardless of 2/2.
>
> Can you elaborate on what bug is present by processes being
> unprotected by seccomp?
>
> Seccomp cannot be universally applied on Android due to previously
> mentioned performance concerns. Seccomp is used in Android primarily
> as a tool to enforce the list of allowed syscalls, so that such
> syscalls can be audited before being included as part of the Android
> API.
>
> -- Nick
>
> --
> Nick Kralevich | nnk@xxxxxxxxxx