On Thu, Mar 21, 2024 at 11:05:00AM -0700, Andrii Nakryiko wrote:
> Inline bpf_get_branch_snapshot() helper using architecture-agnostic
> inline BPF code which calls directly into underlying callback of
> perf_snapshot_branch_stack static call. This callback is set early
> during kernel initialization and is never updated or reset, so it's ok
> to fetch actual implementation using static_call_query() and call
> directly into it.
> 
> This change eliminates a full function call and saves one LBR entry
> in PERF_SAMPLE_BRANCH_ANY LBR mode.
> 
> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> ---
>  kernel/bpf/verifier.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index de7813947981..4fb6c468e199 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20130,6 +20130,43 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>  			goto next_insn;
>  		}
>  
> +		/* Implement bpf_get_branch_snapshot inline. */
> +		if (prog->jit_requested && BITS_PER_LONG == 64 &&
> +		    insn->imm == BPF_FUNC_get_branch_snapshot) {
> +			/* We are dealing with the following func protos:
> +			 * u64 bpf_get_branch_snapshot(void *buf, u32 size, u64 flags);
> +			 * int perf_snapshot_branch_stack(struct perf_branch_entry *entries, u32 cnt);
> +			 */
> +			const u32 br_entry_size = sizeof(struct perf_branch_entry);
> +
> +			/* if (unlikely(flags)) return -EINVAL */
> +			insn_buf[0] = BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 0, 5);

nit, you moved the flags check on top, which I think makes sense and we
should do it in bpf_get_branch_snapshot as well to keep it the same

jirka

> +			/* transform size (bytes) into entry_cnt */
> +			insn_buf[1] = BPF_ALU32_IMM(BPF_DIV, BPF_REG_2, br_entry_size);
> +			/* call perf_snapshot_branch_stack implementation */
> +			insn_buf[2] = BPF_EMIT_CALL(static_call_query(perf_snapshot_branch_stack));
> +			/* if (entry_cnt == 0) return -ENOENT */
> +			insn_buf[3] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4);
> +			/* return entry_cnt * sizeof(struct perf_branch_entry) */
> +			insn_buf[4] = BPF_ALU32_IMM(BPF_MUL, BPF_REG_0, br_entry_size);
> +			insn_buf[5] = BPF_JMP_A(3);
> +			/* return -EINVAL; */
> +			insn_buf[6] = BPF_MOV64_IMM(BPF_REG_0, -EINVAL);
> +			insn_buf[7] = BPF_JMP_A(1);
> +			/* return -ENOENT; */
> +			insn_buf[8] = BPF_MOV64_IMM(BPF_REG_0, -ENOENT);
> +			cnt = 9;
> +
> +			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> +			if (!new_prog)
> +				return -ENOMEM;
> +
> +			delta += cnt - 1;
> +			env->prog = prog = new_prog;
> +			insn = new_prog->insnsi + i + delta;
> +			continue;
> +		}
> +
>  		/* Implement bpf_kptr_xchg inline */
>  		if (prog->jit_requested && BITS_PER_LONG == 64 &&
>  		    insn->imm == BPF_FUNC_kptr_xchg &&
> -- 
> 2.43.0
> 
> 
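
For reference, a minimal sketch of the reordering suggested in the nit
above, assuming bpf_get_branch_snapshot() in kernel/trace/bpf_trace.c
currently takes the snapshot first and checks flags afterwards (the exact
in-tree layout may differ; this is illustrative, not a tested patch):

	/* sketch only: flags validated up front, mirroring the inlined
	 * insn sequence from the patch above
	 */
	BPF_CALL_3(bpf_get_branch_snapshot, void *, buf, u32, size, u64, flags)
	{
	#ifndef CONFIG_X86
		return -ENOENT;
	#else
		static const u32 br_entry_size = sizeof(struct perf_branch_entry);
		u32 entry_cnt = size / br_entry_size;

		/* reject unknown flags before touching the branch stack */
		if (unlikely(flags))
			return -EINVAL;

		entry_cnt = static_call(perf_snapshot_branch_stack)(buf, entry_cnt);

		if (!entry_cnt)
			return -ENOENT;

		return entry_cnt * br_entry_size;
	#endif
	}

The only intended change is hoisting the flags check above the
static_call(), so the helper and the inlined version agree on the order
of checks; error values and the size/entry_cnt conversion stay as they
are today.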