On Tue, 8 Oct 2019 18:51:19 -0700 Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > On Tue, Oct 08, 2019 at 10:07:46AM +0200, Toke Høiland-Jørgensen wrote: > > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes: > > > > > On Mon, Oct 07, 2019 at 07:20:36PM +0200, Toke Høiland-Jørgensen wrote: > > >> From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > > >> > > >> This adds support for wrapping eBPF program dispatch in chain calling > > >> logic. The code injection is controlled by a flag at program load time; if > > >> the flag is set, the BPF program will carry a flag bit that changes the > > >> program dispatch logic to wrap it in a chain call loop. [...] > > >> diff --git a/include/linux/filter.h b/include/linux/filter.h > > >> index 2ce57645f3cd..3d1e4991e61d 100644 > > >> --- a/include/linux/filter.h > > >> +++ b/include/linux/filter.h [...] > > >> #define BPF_PROG_RUN(prog, ctx) ({ \ > > >> @@ -559,14 +585,18 @@ DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key); > > >> if (static_branch_unlikely(&bpf_stats_enabled_key)) { \ > > >> struct bpf_prog_stats *stats; \ > > >> u64 start = sched_clock(); \ > > >> - ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi); \ > > >> + ret = prog->chain_calls ? \ > > >> + do_chain_calls(prog, ctx) : \ > > >> + (*(prog)->bpf_func)(ctx, (prog)->insnsi); \ > > > > > > I thought you agreed on 'no performance regressions' rule? > > > > As I wrote in the cover letter I could not measurable a performance > > impact from this, even with the simplest possible XDP program (where > > program setup time has the largest impact). > > > > This was the performance before/after patch (also in the cover letter): > > > > Before patch (XDP DROP program): 31.5 Mpps > > After patch (XDP DROP program): 32.0 Mpps > > > > So actually this *increases* performance ;) > > (Or rather, the difference is within the measurement uncertainty on my > > system). > > I have hard time believing such numbers. Don't look at this as +/- 500Kpps. Instead you have to realize that the performance difference in ns (nano-seconds) is only 0.5 ns (0.496 ns). (1/31.5*1000)-(1/32.0*1000) = 0.4960 ns This "half-a-nanosec" is below the measurement uncertainty of any system. My system have approx 2-3 ns measurement variance, which I'm proud of. At these speeds (32Mpps) this e.g. 3 ns variance would result in -2.8 Mpps (29.2Mpps). The change Toke did in BPF_PROG_RUN does not introduce any measurable performance change, as least on high-end Intel CPUs. This DOES satisfy 'no performance regressions' rule. You can dislike the solution for other reasons ;-) -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer