On Sun, Apr 28, 2024 at 4:25 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 25 Apr 2024 13:31:53 -0700
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> I'm just coming back from Japan (work and then a vacation), and
> catching up on my email during the 6 hour layover in Detroit.
>
> > Hey Masami,
> >
> > I can't really review most of that code as I'm completely unfamiliar
> > with all those inner workings of fprobe/ftrace/function_graph. I left
> > a few comments where there were somewhat more obvious BPF-related
> > pieces.
> >
> > But I also did run our BPF benchmarks on probes/for-next as a baseline
> > and then with your series applied on top. Just to see if there are any
> > regressions. I think it will be a useful data point for you.
> >
> > You should be already familiar with the bench tool we have in BPF
> > selftests (I used it on some other patches for your tree).
>
> I should get familiar with your tools too.
>

It's a nifty and self-contained tool to do some micro-benchmarking. I
replied to Masami with a few details on how to build and use it.

> >
> > BASELINE
> > ========
> > kprobe         :   24.634 ± 0.205M/s
> > kprobe-multi   :   28.898 ± 0.531M/s
> > kretprobe      :   10.478 ± 0.015M/s
> > kretprobe-multi:   11.012 ± 0.063M/s
> >
> > THIS PATCH SET ON TOP
> > =====================
> > kprobe         :   25.144 ± 0.027M/s (+2%)
> > kprobe-multi   :   28.909 ± 0.074M/s
> > kretprobe      :    9.482 ± 0.008M/s (-9.5%)
> > kretprobe-multi:   13.688 ± 0.027M/s (+24%)
> >
> > These numbers are pretty stable and look to be more or less representative.
>
> Thanks for running this.
>
> >
> > As you can see, kprobes got a bit faster, kprobe-multi seems to be
> > about the same, though.
> >
> > Then (I suppose they are "legacy") kretprobes got quite noticeably
> > slower, almost by 10%. Not sure why, but looks real after re-running
> > benchmarks a bunch of times and getting stable results.
> >
> > On the other hand, multi-kretprobes got significantly faster (+24%!).
> > Again, I don't know if it is expected or not, but it's a nice
> > improvement.
> >
> > If you have any idea why kretprobes would get so much slower, it would
> > be nice to look into that and see if you can mitigate the regression
> > somehow. Thanks!
>
> My guess is that this patch set helps generic use cases for tracing the
> return of functions, but will likely add more overhead for single use
> cases. That is, kretprobe is made to be specific for a single function,
> but kretprobe-multi is more generic. Hence the generic version will
> improve at the sacrifice of the specific function. I did expect as much.
>
> That said, I think there's probably a lot of low hanging fruit that can
> be done to this series to help improve the kretprobe performance. I'm
> not sure we can get back to the baseline, but I'm hoping we can at
> least make it much better than that 10% slowdown.

That would certainly be appreciated, thanks!

But I'm also considering trying to switch to multi-kprobe/kretprobe
automatically on libbpf side, whenever possible, so that users can get
the best performance. There might still be situations where this can't
be done, so singular kprobe/kretprobe can't be completely deprecated,
but multi variants seem to be universally faster, so I'm going to make
them a default (I need to handle some backwards compat aspects, but
that's libbpf-specific stuff you shouldn't be concerned with).

>
> I'll be reviewing this patch set this week as I recover from jetlag.
>
> -- Steve
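
For anyone who wants to double-check the percentages quoted in the benchmark results, here is a minimal Python sketch. The throughput values are copied from the numbers in this thread; the names `baseline`, `patched`, and `percent_change` are made up for illustration and are not part of the BPF selftests bench tool.

```python
# Throughput in M/s, copied from the benchmark results quoted in this thread.
baseline = {
    "kprobe": 24.634,
    "kprobe-multi": 28.898,
    "kretprobe": 10.478,
    "kretprobe-multi": 11.012,
}
patched = {
    "kprobe": 25.144,
    "kprobe-multi": 28.909,
    "kretprobe": 9.482,
    "kretprobe-multi": 13.688,
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Per-benchmark delta of the patched kernel relative to the baseline.
changes = {
    name: percent_change(baseline[name], patched[name]) for name in baseline
}

for name, delta in changes.items():
    print(f"{name:<16}: {delta:+.1f}%")
```

The computed deltas should line up with the rounded figures in the results above (roughly +2% for kprobe, about -9.5% for kretprobe, about +24% for kretprobe-multi), with kprobe-multi essentially unchanged. The sketch ignores the reported ± spread, so it says nothing about statistical significance.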