On Wed, 18 Dec 2019 at 13:40, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
>
> On Wed, 18 Dec 2019 13:18:10 +0100
> Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
>
> > On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 18 Dec 2019 12:39:53 +0100
> > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > >
> > > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > > > >
> > > > > > $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > > >
> > > > > > Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > > > XDP-cpumap      CPU:to  pps        drop-pps   extra-info
> > > > > > XDP-RX          20      7723038    0          0
> > > > > > XDP-RX          total   7723038    0
> > > > > > cpumap_kthread  total   0          0          0
> > > > > > redirect_err    total   0          0
> > > > > > xdp_exception   total   0          0
> > > > >
> > > > > Hmm... I'm missing some counters on the kthread side.
> > > >
> > > > Oh? Any ideas why? I just ran the upstream sample straight off.
> > >
> > > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > > XDP samples to libbpf usage") (Cc Maciej).
> > >
> > > The old bpf_load.c would auto-attach the tracepoints; with libbpf
> > > you have to be explicit about it.
> > >
> > > Can I ask you to also run a test with --stress-mode for
> > > ./xdp_redirect_cpu, to flush out any potential RCU race conditions
> > > (don't provide output, this is just a robustness test)?
> >
> > Sure! Other than that, does the command line above make sense? I'm
> > blasting UDP packets to core 20, and the idea was to re-route them to
> > 22.
>
> Yes, and I love that you are using CPUMAP xdp_redirect_cpu as a test.
>
> Explaining what is going on (so you can say if this is what you wanted
> to test):

I wanted to see whether one could receive (Rx + bpf_redirect_map) more
packets with the change. I figured out that at least bpf_redirect_map
was correctly executed, and that the numbers went up. :-P

> The "XDP-RX" number is the raw XDP redirect number, but the remote CPU,
> where the network-stack processing starts, cannot operate at 7.7 Mpps,
> which the missing tracepoint counters would have shown. You can still
> observe results via nstat, e.g.:
>
>  # nstat -n && sleep 1 && nstat
>
> On the remote CPU 22, the SKB will be constructed, and likely dropped
> due to the overloaded network stack and due to not having a UDP listen
> port.
>
> I sometimes use:
>  # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP
> to drop the UDP packets at an earlier and consistent stage.
>
> The CPUMAP has been carefully designed so that a "producer" cannot be
> slowed down by memory operations done by the "consumer"; this is mostly
> achieved via ptr_ring and careful bulking (cache lines). As your
> driver, i40e, doesn't use 'page_pool', you are not affected by the
> return channel.
>
> Funny test detail: i40e uses a refcnt recycle scheme based on the size
> of the RX ring, so it is affected by a longer outstanding queue. The
> CPUMAP has an intermediate queue, which will be full in this overload
> setting. Try to increase or decrease the parameter --qsize (remember to
> place it as the first argument), and see if this was the limiting
> factor for your XDP-RX number.

Thanks for the elaborate description! (Maybe it's time for samples/bpf
manpages? ;-))


Björn
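
For the missing kthread counters discussed above: with libbpf the cpumap
tracepoint programs have to be attached explicitly. Below is a minimal,
untested sketch of what that could look like. It assumes the section
titles used in samples/bpf/xdp_redirect_cpu_kern.c and the tracepoint
names from include/trace/events/xdp.h; it is an illustration, not the
actual fix that went into the sample.

/* Sketch: explicitly attach the cpumap tracepoint programs with libbpf,
 * which the old bpf_load.c used to do automatically.
 */
#include <stdio.h>
#include <bpf/libbpf.h>

static int attach_tp(struct bpf_object *obj, const char *sec,
                     const char *tp_name)
{
        struct bpf_program *prog;
        struct bpf_link *link;

        prog = bpf_object__find_program_by_title(obj, sec);
        if (!prog) {
                fprintf(stderr, "no program in section %s\n", sec);
                return -1;
        }

        /* The cpumap tracepoints live under the "xdp" category.
         * The returned link is kept alive for the program's lifetime.
         */
        link = bpf_program__attach_tracepoint(prog, "xdp", tp_name);
        if (libbpf_get_error(link)) {
                fprintf(stderr, "failed to attach tracepoint %s\n", tp_name);
                return -1;
        }
        return 0;
}

/* Call after bpf_object__open_file() and bpf_object__load() of the
 * xdp_redirect_cpu_kern.o object, so the enqueue/kthread counters
 * actually get filled in.
 */
static int attach_cpumap_tracepoints(struct bpf_object *obj)
{
        if (attach_tp(obj, "tracepoint/xdp/xdp_cpumap_enqueue",
                      "xdp_cpumap_enqueue"))
                return -1;
        if (attach_tp(obj, "tracepoint/xdp/xdp_cpumap_kthread",
                      "xdp_cpumap_kthread"))
                return -1;
        return 0;
}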