On Wed, 18 Dec 2019 13:18:10 +0100
Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:

> On Wed, 18 Dec 2019 at 13:04, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> >
> > On Wed, 18 Dec 2019 12:39:53 +0100
> > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> >
> > > On Wed, 18 Dec 2019 at 12:11, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, 18 Dec 2019 11:53:52 +0100
> > > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > > >
> > > > > $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0
> > > > >
> > > > > Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs
> > > > > XDP-cpumap      CPU:to  pps            drop-pps    extra-info
> > > > > XDP-RX          20      7723038        0           0
> > > > > XDP-RX          total   7723038        0
> > > > > cpumap_kthread  total   0              0           0
> > > > > redirect_err    total   0              0
> > > > > xdp_exception   total   0              0
> > > >
> > > > Hmm... I'm missing some counters on the kthread side.
> > > >
> > >
> > > Oh? Any ideas why? I just ran the upstream sample straight off.
> >
> > Looks like it happened in commit: bbaf6029c49c ("samples/bpf: Convert
> > XDP samples to libbpf usage") (Cc Maciej).
> >
> > The old bpf_load.c will auto attach the tracepoints... and for libbpf
> > you have to be explicit about it.
> >
> > Can I ask you to also run a test with --stress-mode for
> > ./xdp_redirect_cpu, to flush out any potential RCU race-conditions
> > (don't provide output, this is just a robustness test).
> >
>
> Sure! Other than that, does the command line above make sense? I'm
> blasting UDP packets to core 20, and the idea was to re-route them to
> 22.

Yes, and I love that you are using CPUMAP xdp_redirect_cpu as a test.

Explaining what is going on (so you can say if this is what you wanted
to test):

The "XDP-RX" number is the raw XDP redirect number, but the remote CPU,
where the network stack is started, cannot operate at 7.7Mpps, which the
missing tracepoint numbers should have shown.

You can still observe results via nstat, e.g.:

 # nstat -n && sleep 1 && nstat

On the remote CPU 22, the SKB will be constructed, and likely dropped
due to overloading the network stack and due to not having a UDP listen
port.

I sometimes use:

 # iptables -t raw -I PREROUTING -p udp --dport 9 -j DROP

to drop the UDP packets at an earlier and consistent stage.

The CPUMAP has been carefully designed to avoid that a "producer" can be
slowed down by memory operations done by the "consumer"; this is mostly
achieved via ptr_ring and careful bulking (cache-lines). As your driver
i40e doesn't have 'page_pool', you are not affected by the return
channel.

Funny test/details: i40e uses a refcnt recycle scheme, based off the
size of the RX-ring, thus it is affected by a longer outstanding queue.
The CPUMAP has an intermediate queue, which will be full in this
overload setting. Try to increase or decrease the parameter --qsize
(remember to place it as the first argument), and see if this was the
limiting factor for your XDP-RX number.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer