On 2019/08/14 16:33, Toshiaki Makita wrote:
bpf, hashtab: Compare keys in long
3Mpps vs 4Mpps just from this patch?
or combined with the i40e prefetch patch?
Combined.
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 1 +
Could you share "perf report" for just the hashtab optimization
and for i40e?
Sure, I'll get some more data and post them.
Here are the perf reports and performance numbers.
This time, for some reason, the performance is better than before.
Something in my environment may have changed, but I could not identify what.
I pasted the top 10 functions from perf report, along with the drop rate, for each case.
perf report was run with the --no-children option, so the percentages do not include the load of child functions.
It looks like the hottest function is always the xdp_flow BPF program for XDP,
but it is shown under a meaningless symbol name, such as __this_module+0x800000007446.
- No prefetch, no long-compare
3.3 Mpps
25.22% ksoftirqd/4 [kernel.kallsyms] [k] __this_module+0x800000007446
21.64% ksoftirqd/4 [kernel.kallsyms] [k] __htab_map_lookup_elem
14.93% ksoftirqd/4 [kernel.kallsyms] [k] memcmp
7.07% ksoftirqd/4 [kernel.kallsyms] [k] i40e_clean_rx_irq
4.57% ksoftirqd/4 [kernel.kallsyms] [k] dev_map_enqueue
3.60% ksoftirqd/4 [kernel.kallsyms] [k] lookup_nulls_elem_raw
3.44% ksoftirqd/4 [kernel.kallsyms] [k] page_frag_free
2.69% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_rcv
2.29% ksoftirqd/4 [kernel.kallsyms] [k] xdp_do_redirect
1.51% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_xmit
- With prefetch, no long-compare
3.7 Mpps
25.02% ksoftirqd/4 [kernel.kallsyms] [k] mirred_list_lock+0x800000008052
21.52% ksoftirqd/4 [kernel.kallsyms] [k] __htab_map_lookup_elem
13.20% ksoftirqd/4 [kernel.kallsyms] [k] memcmp
7.38% ksoftirqd/4 [kernel.kallsyms] [k] i40e_clean_rx_irq
4.09% ksoftirqd/4 [kernel.kallsyms] [k] lookup_nulls_elem_raw
3.57% ksoftirqd/4 [kernel.kallsyms] [k] dev_map_enqueue
3.50% ksoftirqd/4 [kernel.kallsyms] [k] page_frag_free
2.86% ksoftirqd/4 [kernel.kallsyms] [k] xdp_do_redirect
2.84% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_rcv
1.79% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_xmit
- No prefetch, with long-compare
4.2 Mpps
24.64% ksoftirqd/4 [kernel.kallsyms] [k] __this_module+0x800000008f47
24.42% ksoftirqd/4 [kernel.kallsyms] [k] __htab_map_lookup_elem
6.91% ksoftirqd/4 [kernel.kallsyms] [k] i40e_clean_rx_irq
4.04% ksoftirqd/4 [kernel.kallsyms] [k] page_frag_free
3.53% ksoftirqd/4 [kernel.kallsyms] [k] lookup_nulls_elem_raw
3.14% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_rcv
3.13% ksoftirqd/4 [kernel.kallsyms] [k] dev_map_enqueue
2.34% ksoftirqd/4 [kernel.kallsyms] [k] xdp_do_redirect
1.76% ksoftirqd/4 [kernel.kallsyms] [k] key_equal
1.37% ksoftirqd/4 [kernel.kallsyms] [k] zero_key+0x800000010e93
NOTE: key_equal is called in place of memcmp.
- With prefetch, with long-compare
4.6 Mpps
26.68% ksoftirqd/4 [kernel.kallsyms] [k] mirred_list_lock+0x80000000a109
22.37% ksoftirqd/4 [kernel.kallsyms] [k] __htab_map_lookup_elem
10.79% ksoftirqd/4 [kernel.kallsyms] [k] i40e_clean_rx_irq
4.74% ksoftirqd/4 [kernel.kallsyms] [k] page_frag_free
4.09% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_rcv
3.97% ksoftirqd/4 [kernel.kallsyms] [k] dev_map_enqueue
3.79% ksoftirqd/4 [kernel.kallsyms] [k] lookup_nulls_elem_raw
3.09% ksoftirqd/4 [kernel.kallsyms] [k] xdp_do_redirect
2.45% ksoftirqd/4 [kernel.kallsyms] [k] key_equal
1.91% ksoftirqd/4 [kernel.kallsyms] [k] veth_xdp_xmit
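
For readers who have not looked at the patch, the idea behind "long-compare" (key_equal taking the place of memcmp in the profiles above) can be sketched in user space roughly as follows. This is only an illustration and not the code from the patch: key_equal_long is a hypothetical name, and the byte-wise tail loop and the memcpy-based loads are my assumptions to keep the sketch safe for arbitrary key sizes and alignments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space sketch, NOT the kernel's key_equal().
 * Compare two keys one unsigned long at a time instead of byte by byte.
 */
static bool key_equal_long(const void *key1, const void *key2, size_t size)
{
	const unsigned char *p1 = key1;
	const unsigned char *p2 = key2;
	size_t off;

	/* Word-sized chunks: memcpy avoids unaligned and aliasing problems. */
	for (off = 0; off + sizeof(unsigned long) <= size;
	     off += sizeof(unsigned long)) {
		unsigned long a, b;

		memcpy(&a, p1 + off, sizeof(a));
		memcpy(&b, p2 + off, sizeof(b));
		if (a != b)
			return false;
	}

	/* Assumption: keys may not be a multiple of sizeof(long), so
	 * compare any remaining tail bytes one at a time. */
	for (; off < size; off++) {
		if (p1[off] != p2[off])
			return false;
	}

	return true;
}
```

Unlike memcmp, this only answers "equal or not" and does not report ordering, which is all a hash table lookup needs.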
Toshiaki Makita