On Fri, Sep 10, 2021 at 06:27:36PM +0000, Song Liu wrote:

> This works great and saves 3 entries! We have the following now:

Yay!

> ID: 0 from bpf_get_branch_snapshot+18 to intel_pmu_snapshot_branch_stack+0

is unavoidable, we need to end up in intel_pmu_snapshot_branch_stack()
eventually.

> ID: 1 from __brk_limit+477143934 to bpf_get_branch_snapshot+0

could be elided by having the JIT emit the call to
intel_pmu_snapshot_branch_stack directly, instead of laundering it
through that helper I suppose.

> ID: 2 from __brk_limit+477192263 to __brk_limit+477143880 # trampoline
> ID: 3 from __bpf_prog_enter+34 to __brk_limit+477192251

-ENOCLUE

> ID: 4 from migrate_disable+60 to __bpf_prog_enter+9
> ID: 5 from __bpf_prog_enter+4 to migrate_disable+0

I suppose we can reduce that to a single branch if we inline
migrate_disable() here, that thing unfortunately needs one branch
itself.

> ID: 6 from bpf_testmod_loop_test+20 to __bpf_prog_enter+0

And this is the first branch out of the test program, giving 7 entries
now, of which we can remove at least 2 more with a bit of elbow grease,
right?

> ID: 7 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
> ID: 8 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
>
> I will fold this in and send v7.

Excellent.