Currently, on x86, when SMAP is enabled, and a page fault occurs in kernel mode for accessing a user address, the kernel will rightly panic as no valid kernel code can cause such a page fault (unless buggy). There is no valid correct kernel code that can generate such a fault, therefore this behavior would be correct. BPF programs that currently encounter user addresses when doing PROBE_MEM loads (load instructions which are allowed to read any kernel address, only available for root users) avoid a page fault by performing bounds checking on the address. This requires the JIT to emit a jump over each PROBE_MEM load instruction to avoid hitting page faults. We would prefer avoiding these jump instructions to improve performance of programs which use PROBE_MEM loads pervasively. For correct behavior, programs already rely on the kernel addresses being valid when they are executing, but BPF's safety properties must still ensure kernel safety in presence of invalid addresses. Therefore, for correct programs, the bounds checking is an added cost meant to ensure kernel safety. If the do_user_addr_fault handler could perform fixups for the BPF program in such a case, the bounds checking could be eliminated, the load instruction could be emitted directly without any checking. Thus, in case SMAP is enabled (which would mean the kernel traps on accessing a user address), and the instruction pointer belongs to a BPF program, perform fixup for the access by searching exception tables. All BPF programs already execute with SMAP protection. When SMAP is not enabled, the BPF JIT will continue to emit bounds checking instructions. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> --- arch/x86/mm/fault.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index e6c469b323cc..189e93d88bd4 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -21,6 +21,7 @@ #include <linux/mm_types.h> #include <linux/mm.h> /* find_and_lock_vma() */ #include <linux/vmalloc.h> +#include <linux/filter.h> /* is_bpf_text_address() */ #include <asm/cpufeature.h> /* boot_cpu_has, ... */ #include <asm/traps.h> /* dotraplinkage, ... */ @@ -1257,6 +1258,16 @@ void do_user_addr_fault(struct pt_regs *regs, if (unlikely(cpu_feature_enabled(X86_FEATURE_SMAP) && !(error_code & X86_PF_USER) && !(regs->flags & X86_EFLAGS_AC))) { + /* + * If the kernel access happened to an invalid user pointer + * under SMAP by a BPF program, we will have an extable entry + * here, and need to perform the fixup. + */ + if (is_bpf_text_address(regs->ip)) { + kernelmode_fixup_or_oops(regs, error_code, address, + 0, 0, ARCH_DEFAULT_PKEY); + return; + } /* * No extable entry here. This was a kernel access to an * invalid pointer. get_kernel_nofault() will not get here. -- 2.43.0