On Wed, 18 Dec 2019 at 13:40, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
>
> On Wed, 18 Dec 2019 13:18:10 +0100
> Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
>
> > On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 18 Dec 2019 12:39:53 +0100
> > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > >
> > > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > > > >
> > > > > > $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > > >
> > > > > > Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > > > XDP-cpumap      CPU:to  pps        drop-pps   extra-info
> > > > > > XDP-RX          20      7723038    0          0
> > > > > > XDP-RX          total   7723038    0
> > > > > > cpumap_kthread  total   0          0          0
> > > > > > redirect_err    total   0          0
> > > > > > xdp_exception   total   0          0
> > > > >
> > > > > Hmm... I'm missing some counters on the kthread side.
> > > >
> > > > Oh? Any ideas why? I just ran the upstream sample straight off.
> > >
> > > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > > XDP samples to libbpf usage") (Cc Maciej).
> > >
> > > The old bpf_load.c would auto-attach the tracepoints; with libbpf
> > > you have to be explicit about it.
> > >
> > > Can I ask you to also run a test with --stress-mode for
> > > ./xdp_redirect_cpu, to flush out any potential RCU race conditions
> > > (don't provide output, this is just a robustness test)?
> >
> > Sure! Other than that, does the command line above make sense? I'm
> > blasting UDP packets to core 20, and the idea was to re-route them to
> > 22.
>
> Yes, and I love that you are using CPUMAP xdp_redirect_cpu as a test.
>
> Explaining what is going on (so you can say if this is what you wanted
> to test):

I wanted to see whether one could receive (Rx + bpf_redirect_map) more
packets with the change. I figured out that at least bpf_redirect_map
was correctly executed, and that the numbers went up. :-P

> The "XDP-RX" number is the raw XDP redirect number, but the remote CPU,
> where the network-stack processing starts, cannot operate at 7.7 Mpps,
> which the missing tracepoint counters would have shown. You can still
> observe results via nstat, e.g.:
>
>  # nstat -n && sleep 1 && nstat
>
> On the remote CPU 22, the SKB will be constructed, and likely dropped
> due to the overloaded network stack and due to not having a UDP listen
> port.
>
> I sometimes use:
>  # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP
> to drop the UDP packets at an earlier and consistent stage.
>
> The CPUMAP has been carefully designed so that a "producer" cannot be
> slowed down by memory operations done by the "consumer"; this is mostly
> achieved via ptr_ring and careful bulking (cache lines). As your
> driver, i40e, doesn't use 'page_pool', you are not affected by the
> return channel.
>
> Funny test detail: i40e uses a refcnt recycle scheme based on the size
> of the RX ring, so it is affected by a longer outstanding queue. The
> CPUMAP has an intermediate queue, which will be full in this overload
> setting. Try to increase or decrease the parameter --qsize (remember to
> place it as the first argument), and see if this was the limiting
> factor for your XDP-RX number.

Thanks for the elaborate description! (Maybe it's time for samples/bpf
manpages? ;-))


Björn
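
For the missing kthread counters discussed above: with libbpf the cpumap
tracepoint programs have to be attached explicitly. Below is a minimal,
untested sketch of what that could look like. It assumes the section
titles used in samples/bpf/xdp_redirect_cpu_kern.c and the tracepoint
names from include/trace/events/xdp.h; it is an illustration, not the
actual fix that went into the sample.

/* Sketch: explicitly attach the cpumap tracepoint programs with libbpf,
 * which the old bpf_load.c used to do automatically.
 */
#include <stdio.h>
#include <bpf/libbpf.h>

static int attach_tp(struct bpf_object *obj, const char *sec,
                     const char *tp_name)
{
        struct bpf_program *prog;
        struct bpf_link *link;

        prog = bpf_object__find_program_by_title(obj, sec);
        if (!prog) {
                fprintf(stderr, "no program in section %s\n", sec);
                return -1;
        }

        /* The cpumap tracepoints live under the "xdp" category.
         * The returned link is kept alive for the program's lifetime.
         */
        link = bpf_program__attach_tracepoint(prog, "xdp", tp_name);
        if (libbpf_get_error(link)) {
                fprintf(stderr, "failed to attach tracepoint %s\n", tp_name);
                return -1;
        }
        return 0;
}

/* Call after bpf_object__open_file() and bpf_object__load() of the
 * xdp_redirect_cpu_kern.o object, so the enqueue/kthread counters
 * actually get filled in.
 */
static int attach_cpumap_tracepoints(struct bpf_object *obj)
{
        if (attach_tp(obj, "tracepoint/xdp/xdp_cpumap_enqueue",
                      "xdp_cpumap_enqueue"))
                return -1;
        if (attach_tp(obj, "tracepoint/xdp/xdp_cpumap_kthread",
                      "xdp_cpumap_kthread"))
                return -1;
        return 0;
}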