On Wed, Feb 23, 2022 at 4:11 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> >
> > For binaries that are statically linked, consecutive stack frames are
> > likely to be in the same VMA and therefore have the same build id.
> > As an optimization for this case, we can cache the previous frame's
> > VMA, if the new frame has the same VMA as the previous one, reuse the
> > previous one's build id. We are holding the MM locks as reader across
> > the entire loop, so we don't need to worry about VMA going away.
> >
> > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > test_progs.
> >
> > Suggested-by: Greg Thelen <gthelen@xxxxxxxxxx>
> > Signed-off-by: Hao Luo <haoluo@xxxxxxxxxx>
> > ---
>
> LGTM. Can you share performance numbers before and after?
>
> Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
>

Thanks Andrii. On a real-world workload, we observed that 66% of cpu
cycles in __bpf_get_stackid() were spent on build_id_parse() and
find_vma(). This was before. We haven't evaluated the performance with
this patch yet. This optimization seems straightforward, so we plan to
upstream it first and then retest.
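
For readers following along, the caching idea from the commit message
can be sketched in plain userspace C roughly as below. This is only an
illustration under stated assumptions, not the patch itself:
struct mock_vma, struct frame_id, mock_find_vma(), mock_build_id_parse()
and resolve_frames() are hypothetical stand-ins for the kernel's VMA
type, find_vma() and build_id_parse(), and the two counters exist only
to show how many slow-path calls the cache avoids. Locking is left out
entirely; in the kernel the loop runs with the MM lock held as reader,
which is what makes holding on to the previous VMA pointer safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUILD_ID_SIZE 20

/* Hypothetical stand-in for a kernel VMA: [start, end) plus its build id. */
struct mock_vma {
	uint64_t start;
	uint64_t end;
	unsigned char build_id[BUILD_ID_SIZE];
};

/* Hypothetical per-frame result: build id + offset, or the raw ip on failure. */
struct frame_id {
	int valid;
	uint64_t offset;
	unsigned char build_id[BUILD_ID_SIZE];
};

/* Counters for the two calls the thread identifies as expensive. */
static unsigned long nr_lookups;
static unsigned long nr_parses;

/* Simplified mock of find_vma(): returns the VMA containing ip, or NULL. */
static const struct mock_vma *mock_find_vma(const struct mock_vma *vmas,
					    size_t nr_vmas, uint64_t ip)
{
	nr_lookups++;
	for (size_t i = 0; i < nr_vmas; i++)
		if (ip >= vmas[i].start && ip < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Mock of build_id_parse(): copies the stored build id, returns 0 on success. */
static int mock_build_id_parse(const struct mock_vma *vma, unsigned char *out)
{
	nr_parses++;
	memcpy(out, vma->build_id, BUILD_ID_SIZE);
	return 0;
}

/* Resolve each ip to (build id, offset), reusing the previous frame's VMA. */
static void resolve_frames(const struct mock_vma *vmas, size_t nr_vmas,
			   const uint64_t *ips, size_t nr_ips,
			   struct frame_id *out)
{
	const struct mock_vma *prev_vma = NULL;
	const unsigned char *prev_build_id = NULL;

	assert(ips != NULL && out != NULL);

	for (size_t i = 0; i < nr_ips; i++) {
		const struct mock_vma *vma;

		if (prev_vma && ips[i] >= prev_vma->start &&
		    ips[i] < prev_vma->end) {
			/* Fast path: same VMA as the previous frame. */
			vma = prev_vma;
			memcpy(out[i].build_id, prev_build_id, BUILD_ID_SIZE);
		} else {
			/* Slow path: look up the VMA and parse its build id. */
			vma = mock_find_vma(vmas, nr_vmas, ips[i]);
			if (!vma || mock_build_id_parse(vma, out[i].build_id)) {
				/* Fall back to the raw ip for this frame. */
				out[i].valid = 0;
				out[i].offset = ips[i];
				memset(out[i].build_id, 0, BUILD_ID_SIZE);
				prev_vma = NULL;
				prev_build_id = NULL;
				continue;
			}
			prev_vma = vma;
			prev_build_id = out[i].build_id;
		}
		out[i].valid = 1;
		out[i].offset = ips[i] - vma->start;
	}
}
```

With a trace where most consecutive frames land in the same mapping,
nr_lookups and nr_parses grow with the number of VMA changes rather
than with the number of frames, which is the case the commit message
targets for statically linked binaries.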