Re: [PATCH bpf-next v3 0/6] Introduce the BPF dispatcher


On Mon, 9 Dec 2019 at 18:00, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
>
> On Mon,  9 Dec 2019 14:55:16 +0100
> Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
>
> > Performance
> > ===========
> >
> > The tests were performed using the xdp_rxq_info sample program with
> > the following command-line:
> >
> > 1. XDP_DRV:
> >   # xdp_rxq_info --dev eth0 --action XDP_DROP
> > 2. XDP_SKB:
> >   # xdp_rxq_info --dev eth0 -S --action XDP_DROP
> > 3. xdp-perf, from selftests/bpf:
> >   # test_progs -v -t xdp_perf
> >
> >
> > Run with mitigations=auto
> > -------------------------
> >
> > Baseline:
> > 1. 22.0 Mpps
> > 2. 3.8 Mpps
> > 3. 15 ns
> >
> > Dispatcher:
> > 1. 29.4 Mpps (+34%)
> > 2. 4.0 Mpps  (+5%)
> > 3. 5 ns      (+66%)
>
> Thanks for providing these extra measurement points.  This is good
> work.  I just want to remind people that when working at these high
> speeds, it is easy to get amazed by a +34% improvement, but we have to
> be careful to understand that this is saving approx 10 ns time or
> cycles.
>
> In reality, the time saved per packet in #2 (3.8 Mpps -> 4.0 Mpps),
> (1/3.8-1/4)*1000 = 13.16 ns, is larger than in #1 (22.0 Mpps -> 29.4
> Mpps), (1/22-1/29.4)*1000 = 11.44 ns.  Test #3 keeps us honest: 15 ns ->
> 5 ns = 10 ns.  The 10 ns improvement is a big deal in XDP context, and
> also corresponds to my own experience with retpoline (approx 12 ns
> overhead).
>
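
For anyone who wants to redo the arithmetic above with more digits, here
is a minimal Python sketch. The helper names (ns_per_packet, ns_saved)
are made up for illustration and are not part of any sample or selftest;
the only inputs are the Mpps figures quoted in this thread.

```python
# Convert packet rates (Mpps) into per-packet time (ns) and compare runs.
# 1 Mpps = 1e6 packets/s, so time per packet in ns is 1e9 / (mpps * 1e6).

def ns_per_packet(mpps: float) -> float:
    """Per-packet processing time in nanoseconds for a rate given in Mpps."""
    return 1000.0 / mpps


def ns_saved(before_mpps: float, after_mpps: float) -> float:
    """Nanoseconds saved per packet when the rate goes from before to after."""
    return ns_per_packet(before_mpps) - ns_per_packet(after_mpps)


if __name__ == "__main__":
    # (label, baseline Mpps, dispatcher Mpps), figures taken from the thread.
    runs = [
        ("XDP_DRV, mitigations=auto", 22.0, 29.4),
        ("XDP_SKB, mitigations=auto", 3.8, 4.0),
        ("XDP_DRV, mitigations=off", 29.6, 30.7),
    ]
    for label, before, after in runs:
        pct = (after / before - 1.0) * 100.0
        print(f"{label}: {pct:+.1f}% -> {ns_saved(before, after):.2f} ns/packet")
```

The relative gain depends heavily on the baseline rate, while the
absolute saving per packet is the number that can be compared across
tests. With only two significant digits on the Mpps inputs, the
computed saving can shift by a few ns, so small differences between
tests should not be over-interpreted.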

Ok, good! :-)

> To Björn, I would appreciate more digits on your Mpps numbers, so I get
> more accuracy on my checks-and-balances I described above.  I suspect
> the 3.8 Mpps -> 4.0 Mpps will be closer to the other numbers when we
> get more accuracy.
>

Ok! Let me re-run them. If you have some spare cycles, it would be
great if you could try it out as well on your Mellanox setup.
Historically you've always been able to get more stable numbers than
I. :-)

>
> > Dispatcher (full; walk all entries, and fallback):
> > 1. 20.4 Mpps (-7%)
> > 2. 3.8 Mpps
> > 3. 18 ns     (-20%)
> >
> > Run with mitigations=off
> > ------------------------
> >
> > Baseline:
> > 1. 29.6 Mpps
> > 2. 4.1 Mpps
> > 3. 5 ns
> >
> > Dispatcher:
> > 1. 30.7 Mpps (+4%)
> > 2. 4.1 Mpps
> > 3. 5 ns
>
> While +4% sounds good, it could be measurement noise ;-)
>
>  (1/29.6-1/30.7)*1000 = 1.21 ns
>
> After all, both #3 results say 5 ns.
>

True. Maybe that simply hints that we shouldn't use the dispatcher here?


Thanks for the comments!
Björn


> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>



