On Wed, Oct 9, 2019 at 9:15 PM Alexei Starovoitov <ast@xxxxxxxxxx> wrote:
>
> Pointer to BTF object is a pointer to kernel object or NULL.
> Such pointers can only be used by BPF_LDX instructions.
> The verifier changed their opcode from LDX|MEM|size
> to LDX|PROBE_MEM|size to make JITing easier.
> The number of entries in extable is the number of BPF_LDX insns
> that access kernel memory via "pointer to BTF type".
> Only these load instructions can fault.
> Since x86 extable is relative it has to be allocated in the same
> memory region as JITed code.
> Allocate it prior to last pass of JITing and let the last pass populate it.
> Pointer to extable in bpf_prog_aux is necessary to make page fault
> handling fast.
> Page fault handling is done in two steps:
> 1. bpf_prog_kallsyms_find() finds BPF program that page faulted.
>    It's done by walking rb tree.
> 2. then extable for given bpf program is binary searched.
> This process is similar to how page faulting is done for kernel modules.
> The exception handler skips over faulting x86 instruction and
> initializes destination register with zero. This mimics exact
> behavior of bpf_probe_read (when probe_kernel_read faults dest is zeroed).
>
> JITs for other architectures can add support in similar way.
> Until then they will reject unknown opcode and fallback to interpreter.
>
> Since extable should be aligned and placed near JITed code
> make bpf_jit_binary_alloc() return 4 byte aligned image offset,
> so that extable aligning formula in bpf_int_jit_compile() doesn't need
> to rely on internal implementation of bpf_jit_binary_alloc().
> On x86 gcc defaults to 16-byte alignment for regular kernel functions
> due to better performance. JITed code may be aligned to 16 in the future,
> but it will use 4 in the meantime.
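
To make the alignment requirement concrete, here is a minimal userspace
sketch of the layout arithmetic. It is not the code from this patch:
`struct ex_entry`, `round_up_pow2()` and `plan_layout()` are hypothetical
names, and the two-field entry is only assumed to resemble the x86 relative
exception table entry. The sketch rounds the JITed program length up to the
entry's alignment and places the table right after the code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a relative exception table entry. */
struct ex_entry {
	int32_t insn;
	int32_t fixup;
};

/* Result of planning one allocation that holds code followed by the table. */
struct layout {
	size_t extable_off;	/* offset of the table from the image start */
	size_t total;		/* bytes needed for code plus table */
};

/* Round x up to the next multiple of align; align must be a power of two. */
static size_t round_up_pow2(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Place the table at the first suitably aligned offset after the code. */
static struct layout plan_layout(size_t proglen, size_t num_entries)
{
	size_t align = _Alignof(struct ex_entry);
	struct layout l;

	l.extable_off = round_up_pow2(proglen, align);
	l.total = l.extable_off + num_entries * sizeof(struct ex_entry);
	return l;
}
```

The offset computed here is only aligned relative to the image start, so the
table address is aligned only if the image itself is. That appears to be the
reason for having bpf_jit_binary_alloc() guarantee the image alignment
instead of having the caller depend on its internals.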
>
> Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> ---

Acked-by: Andrii Nakryiko <andriin@xxxxxx>

>  arch/x86/net/bpf_jit_comp.c | 97 +++++++++++++++++++++++++++++++++++--
>  include/linux/bpf.h         |  3 ++
>  include/linux/extable.h     | 10 ++++
>  kernel/bpf/core.c           | 20 +++++++-
>  kernel/bpf/verifier.c       |  1 +
>  kernel/extable.c            |  2 +
>  6 files changed, 128 insertions(+), 5 deletions(-)
>

[...]
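
As a reading aid for the fault handling described in the commit message,
here is a small userspace model of step 2 and of the fixup. It is a sketch
under assumptions, not the kernel implementation: `struct ex_entry` with its
`len` and `dst` fields, `struct regs`, `struct image` and all function names
are hypothetical, and the per-entry encoding used by the real JIT may differ.
Step 1 (finding the program via the rb tree) is not modeled; the table is
passed in directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: insn is stored relative to the field's own address. */
struct ex_entry {
	int32_t insn;	/* faulting load address, relative to &insn */
	uint8_t len;	/* length of the faulting load in bytes */
	uint8_t dst;	/* index of the destination register */
};

/* Hypothetical register file: instruction pointer plus four registers. */
struct regs {
	uintptr_t ip;
	uint64_t r[4];
};

/* Code and table share one allocation so 32-bit offsets can reach. */
struct image {
	uint8_t code[64];
	struct ex_entry extable[3];
};

/* Resolve the relative insn field back to an absolute address. */
static uintptr_t ex_insn_addr(const struct ex_entry *e)
{
	return (uintptr_t)&e->insn + (uintptr_t)(intptr_t)e->insn;
}

/* Fill one entry; entries must be added in increasing address order. */
static void ex_set(struct ex_entry *e, const uint8_t *insn,
		   uint8_t len, uint8_t dst)
{
	intptr_t delta = (intptr_t)insn - (intptr_t)&e->insn;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	e->insn = (int32_t)delta;
	e->len = len;
	e->dst = dst;
}

/* Binary search for the entry whose load address equals ip. */
static const struct ex_entry *ex_search(const struct ex_entry *tbl,
					size_t n, uintptr_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		uintptr_t addr = ex_insn_addr(&tbl[mid]);

		if (addr == ip)
			return &tbl[mid];
		if (addr < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* On a hit, zero the destination register and skip the faulting load. */
static int ex_fixup(const struct image *img, struct regs *regs)
{
	const struct ex_entry *e = ex_search(img->extable, 3, regs->ip);

	if (!e)
		return 0;
	regs->r[e->dst] = 0;
	regs->ip += e->len;
	return 1;
}

/* Returns 0 if the model behaves as described, nonzero otherwise. */
static int demo_extable(void)
{
	struct image img = {0};
	struct regs regs = { .r = {11, 22, 33, 44} };

	ex_set(&img.extable[0], &img.code[4], 3, 1);
	ex_set(&img.extable[1], &img.code[20], 4, 2);
	ex_set(&img.extable[2], &img.code[40], 7, 0);

	regs.ip = (uintptr_t)&img.code[20];	/* fault on a listed load */
	if (!ex_fixup(&img, &regs))
		return 1;
	if (regs.r[2] != 0 || regs.ip != (uintptr_t)&img.code[24])
		return 2;
	if (regs.r[1] != 22)			/* other registers untouched */
		return 3;

	regs.ip = (uintptr_t)&img.code[21];	/* address not in the table */
	if (ex_fixup(&img, &regs))
		return 4;
	return 0;
}
```

Because each entry stores a 32-bit offset relative to itself, the model only
works when the table sits close to the code, which matches the commit
message's point about allocating the extable in the same memory region as
the JITed image.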