On Wed, May 27, 2020 at 12:21:54PM +0200, Toke Høiland-Jørgensen wrote:
> > The example in patch 2 is functional, but not a lot of effort has been
> > made on performance optimisation. I did a simple test (pkt size 64) with
> > pktgen. Here is the test result with BPF_MAP_TYPE_DEVMAP_HASH arrays:
> >
> > bpf_redirect_map() with 1 ingress, 1 egress:
> > generic path: ~1600k pps
> > native path: ~980k pps
> >
> > bpf_redirect_map_multi() with 1 ingress, 3 egress:
> > generic path: ~600k pps
> > native path: ~480k pps
> >
> > bpf_redirect_map_multi() with 1 ingress, 9 egress:
> > generic path: ~125k pps
> > native path: ~100k pps
> >
> > bpf_redirect_map_multi() is slower than bpf_redirect_map() because we
> > loop over the arrays and clone the skb/xdpf. The native path is slower
> > than the generic path because we send skbs with pktgen. So the result
> > looks reasonable.
>
> How are you running these tests? Still on virtual devices? We really

I ran it with the test topology in patch 2/2. The test runs on physical
machines, but I use veth interfaces. Do you mean I should test with a
physical NIC driver?

BTW, when using pktgen I got a panic because the skb doesn't have enough
headroom. The code path looks like:

do_xdp_generic()
- netif_receive_generic_xdp()
  - skb_headroom(skb) < XDP_PACKET_HEADROOM
    - pskb_expand_head()
      - BUG_ON(skb_shared(skb))

So I added a draft patch for pktgen; I'm not sure whether it affects the
results.

index 08e2811b5274..fee17310c178 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -170,6 +170,7 @@
 #include <linux/uaccess.h>
 #include <asm/dma.h>
 #include <asm/div64.h>		/* do_div */
+#include <linux/bpf.h>
 
 #define VERSION	"2.75"
 #define IP_NAME_SZ 32
@@ -2692,7 +2693,7 @@ static void pktgen_finalize_skb(struct pktgen_dev *pkt_dev, struct sk_buff *skb,
 static struct sk_buff *pktgen_alloc_skb(struct net_device *dev,
 					struct pktgen_dev *pkt_dev)
 {
-	unsigned int extralen = LL_RESERVED_SPACE(dev);
+	unsigned int extralen = LL_RESERVED_SPACE(dev) + XDP_PACKET_HEADROOM;
 	struct sk_buff *skb = NULL;
 	unsigned int size;
 
> need results from a physical setup in native mode to assess the impact
> on the native-XDP fast path. The numbers above don't tell much in this
> regard. I'd also like to see a before/after patch for straight
> bpf_redirect_map(), since you're messing with the fast path, and we want
> to make sure it's not causing a performance regression for regular
> redirect.

OK, I will write a test with 1 ingress + 1 egress for
bpf_redirect_map_multi(), just as Eelco suggested.

> Finally, since the overhead seems to be quite substantial: A comparison
> with a regular network stack bridge might make sense? After all we also
> want to make sure it's a performance win over that :)

OK, will do.

Thanks
Hangbin
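
P.S. To make the pskb_expand_head() failure above a bit more concrete, below
is a minimal, self-contained userspace model of the generic-XDP headroom
handling as I understand it. It is only an illustration, not the kernel code:
struct fake_skb, generic_xdp_prepare() and the other helpers are made-up
stand-ins, and only the names skb_shared(), pskb_expand_head() and
XDP_PACKET_HEADROOM mirror real kernel symbols.

/* Simplified userspace model of the check chain above -- NOT the real
 * kernel implementation, just an illustration of the failure mode. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define XDP_PACKET_HEADROOM 256		/* same value as the kernel define */

struct fake_skb {			/* made-up stand-in for struct sk_buff */
	unsigned int headroom;		/* bytes free in front of the data */
	unsigned int users;		/* >1 means the skb is shared */
};

static bool skb_shared(const struct fake_skb *skb)
{
	return skb->users > 1;
}

/* Stand-in for pskb_expand_head(): reallocating a shared skb is a bug. */
static void pskb_expand_head(struct fake_skb *skb, unsigned int extra)
{
	assert(!skb_shared(skb));	/* models BUG_ON(skb_shared(skb)) */
	skb->headroom += extra;
}

/* Stand-in for the headroom handling in netif_receive_generic_xdp(). */
static void generic_xdp_prepare(struct fake_skb *skb)
{
	if (skb->headroom < XDP_PACKET_HEADROOM)
		pskb_expand_head(skb, XDP_PACKET_HEADROOM - skb->headroom);
}

int main(void)
{
	/* With the draft patch enough headroom is reserved up front, so the
	 * expansion (and the shared check) is never reached. */
	struct fake_skb patched = { .headroom = XDP_PACKET_HEADROOM, .users = 2 };

	/* A reused pktgen skb: small headroom and more than one user. */
	struct fake_skb reused = { .headroom = 16, .users = 2 };

	generic_xdp_prepare(&patched);
	printf("patched skb ok, headroom=%u\n", patched.headroom);

	generic_xdp_prepare(&reused);	/* aborts here, like the BUG_ON */
	return 0;
}

Because pktgen can keep reusing the same skb (clone_skb/burst), its user
count can stay above one, so any attempt to grow the headroom trips the
shared check. Reserving XDP_PACKET_HEADROOM at allocation time, as the draft
patch does, means the generic-XDP path never needs to expand the head at all.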