On Wed, Feb 03, 2021 at 12:56:39AM +0000, Song Liu wrote:
>
>
> > On Feb 1, 2021, at 9:38 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > PTR_TO_BTF_ID registers contain either kernel pointer or NULL.
> > Emit the NULL check explicitly by JIT instead of going into
> > do_user_addr_fault() on NULL deference.
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > ---
> >  arch/x86/net/bpf_jit_comp.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index b7a2911bda77..a3dc3bd154ac 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -930,6 +930,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
> >  		u32 dst_reg = insn->dst_reg;
> >  		u32 src_reg = insn->src_reg;
> >  		u8 b2 = 0, b3 = 0;
> > +		u8 *start_of_ldx;
> >  		s64 jmp_offset;
> >  		u8 jmp_cond;
> >  		u8 *func;
> > @@ -1278,12 +1279,30 @@ st: if (is_imm8(insn->off))
> >  		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
> >  		case BPF_LDX | BPF_MEM | BPF_DW:
> >  		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
> > +			if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
> > +				/* test src_reg, src_reg */
> > +				maybe_emit_mod(&prog, src_reg, src_reg, true); /* always 1 byte */
> > +				EMIT2(0x85, add_2reg(0xC0, src_reg, src_reg));
> > +				/* jne start_of_ldx */
> > +				EMIT2(X86_JNE, 0);
> > +				/* xor dst_reg, dst_reg */
> > +				emit_mov_imm32(&prog, false, dst_reg, 0);
> > +				/* jmp byte_after_ldx */
> > +				EMIT2(0xEB, 0);
> > +
> > +				/* populate jmp_offset for JNE above */
> > +				temp[4] = prog - temp - 5 /* sizeof(test + jne) */;
>
> IIUC, this case only happens for i == 1 in the loop? If so, can we use temp[5(?)]
> instead of start_of_ldx?

I don't understand the question, but let me try anyway :)
temp is a buffer for single instruction.
prog = temp; at the start of every loop iteration (not only i == 1),
so temp[4] is the second byte of the JNE instruction, as the comment says.
temp[5] is the byte after JNE, i.e. the first byte of the XOR.
That XOR is a variable-length instruction, so while emitting JNE
we don't yet know its target offset and just emit 0.
The temp[4] assignment then fills in the actual offset,
since by that point we know the size of the XOR.
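
To make the back-patching concrete, here is a minimal standalone sketch of
the same pattern outside the kernel. The helper names emit1() and
emit_null_check() are hypothetical, and registers are plain x86-64 hardware
numbers (0..15) rather than BPF registers, so this is not the JIT code
itself, only an illustration of why temp[4] is written last.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the kernel's EMIT macros): store one byte, advance. */
static uint8_t *emit1(uint8_t *prog, uint8_t b)
{
	*prog = b;
	return prog + 1;
}

/*
 * Emit "test src,src; jne +N; xor dst,dst; jmp +0" into temp and back-patch
 * the JNE displacement. src/dst are x86-64 hardware register numbers (an
 * assumption of this sketch). Returns the number of bytes written.
 */
static int emit_null_check(uint8_t *temp, unsigned int src, unsigned int dst)
{
	uint8_t *prog = temp;

	assert(src < 16 && dst < 16);

	/* test src, src: REX.W (plus R and B for r8..r15), 0x85, ModRM = 3 bytes */
	prog = emit1(prog, 0x48 | (src >= 8 ? 0x05 : 0x00));
	prog = emit1(prog, 0x85);
	prog = emit1(prog, 0xC0 | ((src & 7) << 3) | (src & 7));

	/* jne rel8: displacement unknown yet, emit 0 as a placeholder (temp[4]) */
	prog = emit1(prog, 0x75);
	prog = emit1(prog, 0x00);

	/* 32-bit xor dst, dst: 2 bytes, or 3 when a REX prefix is needed */
	if (dst >= 8)
		prog = emit1(prog, 0x45);
	prog = emit1(prog, 0x31);
	prog = emit1(prog, 0xC0 | ((dst & 7) << 3) | (dst & 7));

	/* jmp rel8: left at 0 here because this sketch emits no load after it */
	prog = emit1(prog, 0xEB);
	prog = emit1(prog, 0x00);

	/* Now the XOR size is known: skip everything after test + jne (5 bytes). */
	temp[4] = (uint8_t)(prog - temp - 5);

	return (int)(prog - temp);
}
```

With src = dst = 0 (rax/eax) the sequence is 9 bytes and temp[4] ends up
as 4; with dst = 8 (r8d) the XOR grows by one REX byte, so the sequence is
10 bytes and temp[4] becomes 5. That difference is the reason the offset
cannot be written at the time the JNE is emitted.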