On Mon, 2017-04-10 at 17:41 -0400, Andy Gospodarek wrote:
> Results with bpf_jit_enable=1 and gro off, still only 7.5 Mpps.

7.5 Mpps is already a nice number.

We all understand that using native XDP in the driver might give some
extra pps, but for the ones that do not want to patch old drivers, it
is probably good enough.

>  26.34%  ksoftirqd/5  [kernel.vmlinux]  [k] memcpy_erms

Can you investigate to see what the call graphs are? Is this from
copying headers to skb->head in the bnxt driver? Some kind of
copybreak maybe?

perf record -a -g sleep 5
perf report --stdio

Copybreak is generally not really useful, and can have downsides.
It is much better to let the upper stacks decide this. For example,
there is no point doing copybreak for TCP ACK packets that are going
to be consumed immediately. There is also no point doing copybreak in
case the packet will be dropped (say by ... XDP ;) )

>  14.79%  ksoftirqd/5  [bnxt_en]         [k] bnxt_rx_pkt
>  10.11%  ksoftirqd/5  [kernel.vmlinux]  [k] __build_skb
>   5.01%  ksoftirqd/5  [kernel.vmlinux]  [k] page_frag_free
>   4.66%  ksoftirqd/5  [kernel.vmlinux]  [k] kmem_cache_alloc
>   4.19%  ksoftirqd/5  [kernel.vmlinux]  [k] kmem_cache_free
>   3.67%  ksoftirqd/5  [bnxt_en]         [k] bnxt_poll
>   2.97%  ksoftirqd/5  [kernel.vmlinux]  [k] netif_receive_skb_internal
>   2.24%  ksoftirqd/5  [kernel.vmlinux]  [k] __napi_alloc_skb
>   1.92%  ksoftirqd/5  [kernel.vmlinux]  [k] eth_type_trans
>   1.78%  ksoftirqd/5  [bnxt_en]         [k] bnxt_rx_xdp
>   1.62%  ksoftirqd/5  [kernel.vmlinux]  [k] net_rx_action