Commit-ID: c7b6f29b6257532792fc722b68fcc0e00b5a856c Gitweb: https://git.kernel.org/tip/c7b6f29b6257532792fc722b68fcc0e00b5a856c Author: Nadav Amit <namit@xxxxxxxxxx> AuthorDate: Thu, 25 Apr 2019 17:11:43 -0700 Committer: Ingo Molnar <mingo@xxxxxxxxxx> CommitDate: Tue, 30 Apr 2019 12:37:48 +0200 bpf: Fail bpf_probe_write_user() while mm is switched When using a temporary mm, bpf_probe_write_user() should not be able to write to user memory, since user memory addresses may be used to map kernel memory. Detect these cases and fail bpf_probe_write_user() in such cases. Suggested-by: Jann Horn <jannh@xxxxxxxxxx> Reported-by: Jann Horn <jannh@xxxxxxxxxx> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Cc: <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: <ard.biesheuvel@xxxxxxxxxx> Cc: <deneen.t.dock@xxxxxxxxx> Cc: <kernel-hardening@xxxxxxxxxxxxxxxxxx> Cc: <kristen@xxxxxxxxxxxxxxx> Cc: <linux_dti@xxxxxxxxxx> Cc: <will.deacon@xxxxxxx> Cc: Alexei Starovoitov <ast@xxxxxxxxxx> Cc: Andy Lutomirski <luto@xxxxxxxxxx> Cc: Borislav Petkov <bp@xxxxxxxxx> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> Cc: H. Peter Anvin <hpa@xxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Rik van Riel <riel@xxxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Link: https://lkml.kernel.org/r/20190426001143.4983-24-namit@xxxxxxxxxx Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> --- kernel/trace/bpf_trace.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index d64c00afceb5..94b0e37d90ef 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -14,6 +14,8 @@ #include <linux/syscalls.h> #include <linux/error-injection.h> +#include <asm/tlb.h> + #include "trace_probe.h" #include "trace.h" @@ -163,6 +165,10 @@ BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src, * access_ok() should prevent writing to non-user memory, but in * some situations (nommu, temporary switch, etc) access_ok() does * not provide enough validation, hence the check on KERNEL_DS. + * + * nmi_uaccess_okay() ensures the probe is not run in an interim + * state, when the task or mm are switched. This is specifically + * required to prevent the use of temporary mm. */ if (unlikely(in_interrupt() || @@ -170,6 +176,8 @@ BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src, return -EPERM; if (unlikely(uaccess_kernel())) return -EPERM; + if (unlikely(!nmi_uaccess_okay())) + return -EPERM; if (!access_ok(unsafe_ptr, size)) return -EPERM;