Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other

Jesper Dangaard Brouer <brouer@xxxxxxxxxx> · Wed, 9 Oct 2019 12:19:55 +0200

On Tue, 8 Oct 2019 18:51:19 -0700
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:

> On Tue, Oct 08, 2019 at 10:07:46AM +0200, Toke Høiland-Jørgensen wrote:
> > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> >   
> > > On Mon, Oct 07, 2019 at 07:20:36PM +0200, Toke Høiland-Jørgensen wrote:  
> > >> From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> > >> 
> > >> This adds support for wrapping eBPF program dispatch in chain calling
> > >> logic. The code injection is controlled by a flag at program load time; if
> > >> the flag is set, the BPF program will carry a flag bit that changes the
> > >> program dispatch logic to wrap it in a chain call loop.
[...]
> > >> diff --git a/include/linux/filter.h b/include/linux/filter.h
> > >> index 2ce57645f3cd..3d1e4991e61d 100644
> > >> --- a/include/linux/filter.h
> > >> +++ b/include/linux/filter.h
[...]
> > >>  #define BPF_PROG_RUN(prog, ctx)	({				\
> > >> @@ -559,14 +585,18 @@ DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
> > >>  	if (static_branch_unlikely(&bpf_stats_enabled_key)) {	\
> > >>  		struct bpf_prog_stats *stats;			\
> > >>  		u64 start = sched_clock();			\
> > >> -		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\
> > >> +		ret = prog->chain_calls ?			\
> > >> +			do_chain_calls(prog, ctx) :			\
> > >> +			 (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\  
> > >
> > > I thought you agreed on 'no performance regressions' rule?  
> > 
> > As I wrote in the cover letter I could not measurable a performance
> > impact from this, even with the simplest possible XDP program (where
> > program setup time has the largest impact).
> > 
> > This was the performance before/after patch (also in the cover letter):
> > 
> > Before patch (XDP DROP program):  31.5 Mpps
> > After patch (XDP DROP program):   32.0 Mpps
> > 
> > So actually this *increases* performance ;)
> > (Or rather, the difference is within the measurement uncertainty on my
> > system).  
> 
> I have hard time believing such numbers.

Don't look at this as +/- 500Kpps.  Instead you have to realize that the
performance difference in ns (nano-seconds) is only 0.5 ns (0.496 ns).

 (1/31.5*1000)-(1/32.0*1000) = 0.4960 ns

This "half-a-nanosec" is below the measurement uncertainty of any
system. My system have approx 2-3 ns measurement variance, which I'm
proud of.  At these speeds (32Mpps) this e.g. 3 ns variance would
result in -2.8 Mpps (29.2Mpps).

The change Toke did in BPF_PROG_RUN does not introduce any measurable
performance change, as least on high-end Intel CPUs.  This DOES satisfy
'no performance regressions' rule.  You can dislike the solution for
other reasons ;-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer