On Wed, Jun 19, 2024 at 2:22 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> Currently, on x86, when SMAP is enabled and a page fault occurs in
> kernel mode while accessing a user address, the kernel will rightly
> panic: no valid kernel code can generate such a fault (unless buggy),
> so panicking is the correct behavior.
>
> BPF programs that encounter user addresses when doing PROBE_MEM loads
> (load instructions which are allowed to read any kernel address, only
> available to root users) currently avoid a page fault by performing
> bounds checking on the address. This requires the JIT to emit a jump
> over each PROBE_MEM load instruction to avoid hitting page faults.
>
> We would prefer to avoid these jump instructions to improve the
> performance of programs which use PROBE_MEM loads pervasively. For
> correct behavior, programs already rely on the kernel addresses being
> valid while they execute, but BPF's safety properties must still
> ensure kernel safety in the presence of invalid addresses. For
> correct programs, therefore, the bounds checking is an added cost
> meant only to ensure kernel safety. If the do_user_addr_fault handler
> could perform fixups for the BPF program in such a case, the bounds
> checking could be eliminated and the load instruction emitted
> directly, without any checking.
>
> Thus, when SMAP is enabled (which means the kernel traps on accessing
> a user address) and the instruction pointer belongs to a BPF program,
> perform the fixup for the access by searching the exception tables.
> All BPF programs already execute with SMAP protection. When SMAP is
> not enabled, the BPF JIT will continue to emit bounds checking
> instructions.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
> ---
>  arch/x86/mm/fault.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index e6c469b323cc..189e93d88bd4 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -21,6 +21,7 @@
>  #include <linux/mm_types.h>
>  #include <linux/mm.h>			/* find_and_lock_vma() */
>  #include <linux/vmalloc.h>
> +#include <linux/filter.h>		/* is_bpf_text_address() */
>
>  #include <asm/cpufeature.h>		/* boot_cpu_has, ... */
>  #include <asm/traps.h>			/* dotraplinkage, ... */
> @@ -1257,6 +1258,16 @@ void do_user_addr_fault(struct pt_regs *regs,
>  	if (unlikely(cpu_feature_enabled(X86_FEATURE_SMAP) &&
>  		     !(error_code & X86_PF_USER) &&
>  		     !(regs->flags & X86_EFLAGS_AC))) {
> +		/*
> +		 * If the kernel access happened to an invalid user pointer
> +		 * under SMAP by a BPF program, we will have an extable entry
> +		 * here, and need to perform the fixup.
> +		 */
> +		if (is_bpf_text_address(regs->ip)) {
> +			kernelmode_fixup_or_oops(regs, error_code, address,
> +						 0, 0, ARCH_DEFAULT_PKEY);
> +			return;
> +		}

I see no harm in doing this check here.
Looks correct.

Andy, we've talked about this idea in the past.
Any objections to this optimization?
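
For anyone following along, here is a minimal C-level sketch of the guard
the JIT emits around each PROBE_MEM load today, i.e. the per-load cost this
patch wants to remove. The helper name and the exact bound are illustrative
only; the real JIT emits this logic inline as x86 instructions (see the
BPF_PROBE_MEM handling in arch/x86/net/bpf_jit_comp.c):

/* Illustrative sketch, not the JIT's literal output. */
static u64 probe_mem_load_checked(u64 src, s16 off)
{
	u64 addr = src + off;

	/*
	 * Anything at or below the user address space limit is treated as
	 * a potentially faulting user pointer: the destination is zeroed
	 * and the load is jumped over instead of being executed.
	 */
	if (addr < TASK_SIZE_MAX + PAGE_SIZE)
		return 0;

	return *(u64 *)addr;	/* the actual PROBE_MEM load, with extable entry */
}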
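
And for reference, the fixup that kernelmode_fixup_or_oops() ends up
invoking for a BPF program: fixup_exception() dispatches the JIT's extable
entries to its handler, which zeroes the destination register and resumes
past the faulting load. Roughly (simplified from ex_handler_bpf in
arch/x86/net/bpf_jit_comp.c):

bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs)
{
	/*
	 * The fixup field encodes the pt_regs offset of the destination
	 * register in the upper bits and the length of the faulting load
	 * instruction in the low byte.
	 */
	u32 reg = x->fixup >> 8;

	/* Make the load appear to have returned 0 ... */
	*(unsigned long *)((void *)regs + reg) = 0;
	/* ... and skip over the faulting instruction. */
	regs->ip += x->fixup & 0xff;

	return true;
}

So with the patch, a PROBE_MEM load of a user pointer under SMAP behaves
the same as a load of a bad kernel pointer does today: the program simply
observes 0, with no oops and no per-load bounds check.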