> On Sep 3, 2021, at 1:47 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Sep 02, 2021 at 09:57:05AM -0700, Song Liu wrote:
>> +BPF_CALL_3(bpf_get_branch_snapshot, void *, buf, u32, size, u64, flags)
>> +{
>> +	static const u32 br_entry_size = sizeof(struct perf_branch_entry);
>> +	u32 entry_cnt = size / br_entry_size;
>> +
>> +	if (unlikely(flags))
>> +		return -EINVAL;
>> +
>> +	if (!buf || (size % br_entry_size != 0))
>> +		return -EINVAL;
>> +
>> +	entry_cnt = static_call(perf_snapshot_branch_stack)(buf, entry_cnt);
>
> That's at least 2, possibly 3 branches just from the sanity checks, plus
> at least one from starting the BPF prog and one from calling this
> function, gets you at ~5 branch entries gone before you even do the
> snapshot thing.

Let me try to shuffle the function and get rid of some of these checks.

>
> Less if you're in branch-stack mode.
>
> Can't the validator help with getting rid of the some of that?
>
> I suppose you have to have this helper function because the JIT cannot
> emit static_call()... although in this case one could cheat and simply
> emit a call to static_call_query() and not bother with dynamic updates
> (because there aren't any).

We only JIT some key helper functions. I didn't think about that because
the current version is OK for mainstream and future hardware. I guess we
can try JIT if it turns out some architecture needs more optimization.

Thanks,
Song
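
P.S. One possible way to shuffle it (an untested sketch; it assumes the
verifier already guarantees buf points to at least size bytes of valid
memory, so the !buf and size % br_entry_size checks can be dropped, and
it returns the number of bytes written, which may not be the convention
we settle on):

BPF_CALL_3(bpf_get_branch_snapshot, void *, buf, u32, size, u64, flags)
{
	static const u32 br_entry_size = sizeof(struct perf_branch_entry);
	u32 entry_cnt = size / br_entry_size;

	/*
	 * Take the snapshot first, so the sanity checks below do not
	 * burn extra branch entries before the stack is captured.
	 */
	entry_cnt = static_call(perf_snapshot_branch_stack)(buf, entry_cnt);

	/* Validate after the fact; a bad call just wastes the snapshot. */
	if (unlikely(flags))
		return -EINVAL;

	if (!entry_cnt)
		return -ENOENT;

	return entry_cnt * br_entry_size;
}

That keeps the helper down to the static_call plus two cheap checks that
only run after the snapshot has already been taken.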