On 2024-05-22 09:09:45 [+0200], Jesper Dangaard Brouer wrote:
> For this benchmark, to focus, I would reduce this to:
> # perf report --sort cpu,symbol --no-children

Keeping the bpf_net_ctx_set()/clear, removing the NULL checks (to align with
Alexei in his last email).

Perf numbers wise, I'm using
	xdp-bench redirect-cpu --cpu 3 --remote-action drop eth1 -e

Unpatched:
| eth1->?                9,427,705 rx/s                 0 err,drop/s
| receive total          9,427,705 pkt/s                0 drop/s                0 error/s
|   cpu:17               9,427,705 pkt/s                0 drop/s                0 error/s
| enqueue to cpu 3       9,427,708 pkt/s                0 drop/s             8.00 bulk-avg
|   cpu:17->3            9,427,708 pkt/s                0 drop/s             8.00 bulk-avg
| kthread total          9,427,710 pkt/s                0 drop/s          147,276 sched
|   cpu:3                9,427,710 pkt/s                0 drop/s          147,276 sched
| xdp_stats                      0 pass/s       9,427,710 drop/s                0 redir/s
|   cpu:3                        0 pass/s       9,427,710 drop/s                0 redir/s
| redirect_err                   0 error/s
| xdp_exception                  0 hit/s

Patched:
| eth1->?                9,557,170 rx/s                 0 err,drop/s
| receive total          9,557,170 pkt/s                0 drop/s                0 error/s
|   cpu:9                9,557,170 pkt/s                0 drop/s                0 error/s
| enqueue to cpu 3       9,557,170 pkt/s                0 drop/s             8.00 bulk-avg
|   cpu:9->3             9,557,170 pkt/s                0 drop/s             8.00 bulk-avg
| kthread total          9,557,195 pkt/s                0 drop/s          126,164 sched
|   cpu:3                9,557,195 pkt/s                0 drop/s          126,164 sched
| xdp_stats                      0 pass/s       9,557,195 drop/s                0 redir/s
|   cpu:3                        0 pass/s       9,557,195 drop/s                0 redir/s
| redirect_err                   0 error/s
| xdp_exception                  0 hit/s

I think this is noise.

perf output as suggested (perf report --sort cpu,symbol --no-children).
unpatched:
|  19.05%  017  [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash
|  11.40%  017  [k] ixgbe_poll
|  10.68%  003  [k] cpu_map_kthread_run
|   7.62%  003  [k] intel_idle
|   6.11%  017  [k] xdp_do_redirect
|   6.01%  003  [k] page_frag_free
|   4.72%  017  [k] bq_flush_to_queue
|   3.74%  017  [k] cpu_map_redirect
|   2.35%  003  [k] xdp_return_frame
|   1.55%  003  [k] bpf_prog_57cd311f2e27366b_cpumap_drop
|   1.49%  017  [k] dma_sync_single_for_device
|   1.41%  017  [k] ixgbe_alloc_rx_buffers
|   1.26%  017  [k] cpu_map_enqueue
|   1.24%  017  [k] dma_sync_single_for_cpu
|   1.12%  003  [k] __xdp_return
|   0.83%  017  [k] bpf_trace_run4
|   0.77%  003  [k] __switch_to

patched:
|  18.20%  009  [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash
|  11.64%  009  [k] ixgbe_poll
|   7.74%  003  [k] page_frag_free
|   6.69%  003  [k] cpu_map_bpf_prog_run_xdp
|   6.02%  003  [k] intel_idle
|   5.96%  009  [k] xdp_do_redirect
|   4.45%  003  [k] cpu_map_kthread_run
|   3.71%  009  [k] cpu_map_redirect
|   3.23%  009  [k] bq_flush_to_queue
|   2.55%  003  [k] xdp_return_frame
|   1.67%  003  [k] bpf_prog_57cd311f2e27366b_cpumap_drop
|   1.60%  009  [k] _raw_spin_lock
|   1.57%  009  [k] bpf_prog_d7eca17ddc334d36_tp_xdp_cpumap_enqueue
|   1.48%  009  [k] dma_sync_single_for_device
|   1.47%  009  [k] ixgbe_alloc_rx_buffers
|   1.39%  009  [k] dma_sync_single_for_cpu
|   1.33%  009  [k] cpu_map_enqueue
|   1.19%  003  [k] __xdp_return
|   0.66%  003  [k] __switch_to

I'm going to repost the series once the merge window closes unless there is
something you want me to do.

> --Jesper

Sebastian
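
For readers who want to put the two xdp-bench totals into perspective, the
sketch below converts them into a relative change and a per-packet time
difference. Only the two rx/s figures are taken from the output above; the
function and variable names are illustrative and not part of xdp-bench or the
patch series. Whether a difference of this size counts as noise depends on the
run-to-run variance, which this mail does not report (note also that the RX
CPU differs between the runs, 17 vs 9).

```python
def relative_change_pct(baseline: float, candidate: float) -> float:
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


def ns_per_packet(pps: float) -> float:
    """Average time budget per packet, in nanoseconds, at a given rate."""
    return 1e9 / pps


# "receive total" pkt/s values copied from the two xdp-bench runs above.
unpatched_pps = 9_427_705
patched_pps = 9_557_170

delta_pct = relative_change_pct(unpatched_pps, patched_pps)
delta_ns = ns_per_packet(unpatched_pps) - ns_per_packet(patched_pps)

print(f"throughput change: {delta_pct:+.2f} %")
print(f"per-packet time:   {ns_per_packet(unpatched_pps):.2f} ns -> "
      f"{ns_per_packet(patched_pps):.2f} ns (saved {delta_ns:.2f} ns)")
```

Repeating each run several times and comparing the spread of the totals
against this difference would be one way to back up the noise assessment.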