On Tue, 2 Apr 2024 22:21:00 -0700
Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:

> > I just checked our fleet-wide production data for the last 24 hours.
> > Within the kprobe/kretprobe code path (ftrace_trampoline and
> > everything called from it), rcu_is_watching (both calls, see below)
> > cause 0.484% CPU cycles usage, which isn't nothing. So definitely we'd
> > prefer to be able to avoid that in production use cases.
> >
>
> I just ran synthetic microbenchmark testing multi-kretprobe
> throughput. We get (in millions of BPF kretprobe-multi program
> invocations per second):
>   - 5.568M/s as baseline;
>   - 5.679M/s with changes in this patch (+2% throughput improvement);
>   - 5.808M/s with disabling rcu_is_watching in rethook_try_get()
>     (+2.3% more vs just one of rcu_is_watching, and +4.3% vs baseline).
>
> It's definitely noticeable.

Ah, thanks for verifying (I should have read the thread before replying to
the previous email).

-- Steve
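
For reference, the percentages in the quoted benchmark can be rechecked from the three throughput figures alone. This is only a small arithmetic sketch. The function and variable names are made up for illustration and are not part of the patch or of any kernel tooling.

```python
def pct_gain(before: float, after: float) -> float:
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Throughput figures quoted in the thread, in millions of
# kretprobe-multi program invocations per second.
baseline = 5.568       # unpatched
patched = 5.679        # with the changes in this patch
no_rcu_check = 5.808   # rcu_is_watching also disabled in rethook_try_get()

print(f"patch vs baseline:          {pct_gain(baseline, patched):.2f}%")
print(f"no rcu check vs patch:      {pct_gain(patched, no_rcu_check):.2f}%")
print(f"no rcu check vs baseline:   {pct_gain(baseline, no_rcu_check):.2f}%")
```

Rounded to one decimal place, the three gains agree with the +2%, +2.3% and +4.3% stated in the quoted message.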