Re: [PATCHv9 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support

On Mon, Apr 26, 2021 at 11:23:08AM +0200, Jesper Dangaard Brouer wrote:
> > I re-checked the performance data. The data
> > > Version          | Test                                | Generic | Native
> > > 5.12 rc4         | redirect_map        i40e->i40e      |    1.9M |  9.6M
> > > 5.12 rc4 + patch | redirect_map        i40e->i40e      |    1.9M |  9.3M  
> > 
> > was collected on version 5 of the patch set.
> > 
> > Today I re-did the test on version 10, with xchg() changed to
> > READ_ONCE()/WRITE_ONCE(). Here is the new data (the Generic path
> > numbers are omitted as there is no change):
> > 
> > Version          | Test                                | Native
> > 5.12 rc4         | redirect_map        i40e->i40e      |  9.7M
> > 5.12 rc4         | redirect_map        i40e->veth      | 11.8M
> > 
> > 5.12 rc4 + patch | redirect_map        i40e->i40e      |  9.6M
> 
> Great to see that the baseline redirect_map (i40e->i40e) has almost no
> impact, only ~1.07 ns ((1/9.6 - 1/9.7) * 1000), which is what we want
> to see.  (It might be zero, as measurements can fluctuate when the
> diff is below 2 ns.)
> 
> 
> > 5.12 rc4 + patch | redirect_map        i40e->veth      | 11.6M
> 
> What XDP program are you running on the inner veth?

XDP_DROP
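
I.e. the veth peer just runs a trivial drop program along these lines
(minimal sketch; the program name is only illustrative):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp")
    int xdp_drop_prog(struct xdp_md *ctx)
    {
            /* Drop everything so the test measures pure redirect cost */
            return XDP_DROP;
    }

    char _license[] SEC("license") = "GPL";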

> 
> > 5.12 rc4 + patch | redirect_map multi  i40e->i40e      |  9.5M
> 
> I'm very surprised to see redirect_map multi being so fast (9.5M vs.
> 9.6M normal map-redir).  I was expecting to see larger overhead, as the

Yes, that test uses a hash map with only 4 entries and redirects from one
port to one port, so the overhead is mainly from looping over the map.
(This info will be added to the new patch set description.)
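
To be concrete, the multi path walks every map slot even when only one
destination is populated. A simplified sketch of the loop in
dev_map_enqueue_multi() for a plain devmap (the DEVMAP_HASH variant
walks hash buckets instead, and the real code defers the clone for the
last destination, so a single-destination redirect never copies):

    /* simplified sketch, not the exact patch hunk */
    for (i = 0; i < map->max_entries; i++) {
            dst = rcu_dereference(dtab->netdev_map[i]);
            if (!dst || (exclude_ingress && dst->dev == dev_rx))
                    continue;
            err = dev_map_enqueue_clone(dst, dev_rx, xdpf);
            if (unlikely(err))
                    return err;
    }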

> code dev_map_enqueue_clone() would clone the packet in xdpf_clone() via
> allocating a new page (dev_alloc_page) and then doing a memcpy().
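
(For reference, xdpf_clone() is essentially the following trimmed
sketch, with the mem-info setup elided:

    struct xdp_frame *nxdpf;
    struct page *page;
    unsigned int headroom = xdpf->headroom + sizeof(*xdpf);
    unsigned int totalsize = headroom + xdpf->len;

    if (unlikely(totalsize > PAGE_SIZE))
            return NULL;
    page = dev_alloc_page();        /* one new page per clone */
    if (!page)
            return NULL;
    nxdpf = page_to_virt(page);
    memcpy(nxdpf, xdpf, totalsize); /* copy metadata + payload */
    nxdpf->data = (void *)nxdpf + headroom;
    nxdpf->frame_sz = PAGE_SIZE;
    return nxdpf;

so the per-clone cost is one page allocation plus one full-frame copy.)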
> 
> Looking closer at this patchset, I realize that the test
> 'redirect_map-multi' is testing an optimization, and will never call
> dev_map_enqueue_clone() + xdpf_clone().  IMHO trying to optimize
> 'redirect_map-multi' to be just as fast as base 'redirect_map' doesn't
> make much sense.  If the 'broadcast' call only sends a single packet,
> then there isn't any reason to call the 'multi' variant.

Yes, that's why there is also the i40e->mlx4+veth test.

> 
> Does the 'selftests/bpf' make sure to activate the code path that does
> cloning?

Yes, the selftests redirect packets to 2 or 3 other interfaces.
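
The BPF side of that test is basically just the new flags (sketch; the
map name is illustrative):

    struct {
            __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
            __uint(key_size, sizeof(int));
            __uint(value_size, sizeof(int));
            __uint(max_entries, 4);
    } map_all SEC(".maps");

    SEC("xdp")
    int xdp_redirect_map_multi_prog(struct xdp_md *ctx)
    {
            /* broadcast to every interface in map_all except the
             * ingress one, so the clone path runs whenever the map
             * holds more than one entry */
            return bpf_redirect_map(&map_all, 0,
                                    BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }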

> 
> > 5.12 rc4 + patch | redirect_map multi  i40e->veth      | 11.5M
> > 5.12 rc4 + patch | redirect_map multi  i40e->mlx4+veth |  3.9M
> > 
> > And after add unlikely() in the check path, the new data looks like
> > 
> > Version          | Test                                | Native
> > 5.12 rc4 + patch | redirect_map        i40e->i40e      |  9.6M
> > 5.12 rc4 + patch | redirect_map        i40e->veth      | 11.7M
> > 5.12 rc4 + patch | redirect_map multi  i40e->i40e      |  9.4M
> > 5.12 rc4 + patch | redirect_map multi  i40e->veth      | 11.4M
> > 5.12 rc4 + patch | redirect_map multi  i40e->mlx4+veth |  3.8M
> > 
> > So with unlikely(), redirect_map is slightly up, while redirect_map
> > broadcast drops a little. But overall, redirect_map now shows almost
> > no gap compared with the kernel without this patch.
> > 
> > Do you think we still need the unlikely() in the check path?
> 
> Yes.  The call to redirect_map multi is allowed (and expected) to be
> slower, because when using it to broadcast packets we expect that
> dev_map_enqueue_clone() + xdpf_clone() will get activated, which will
> be the dominating overhead.

OK, I will keep it.
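
Just to confirm the shape: the check path would then look roughly like
this (simplified sketch of the idea in xdp_do_redirect(), not the exact
hunk), which also shows the xchg() -> READ_ONCE()/WRITE_ONCE() change
discussed above:

    /* plain redirect keeps the straight-line else path; only the
     * broadcast case takes the unlikely() branch */
    map = READ_ONCE(ri->map);
    if (unlikely(map)) {
            WRITE_ONCE(ri->map, NULL);
            err = dev_map_enqueue_multi(xdpf, dev, map,
                                        ri->flags & BPF_F_EXCLUDE_INGRESS);
    } else {
            err = dev_map_enqueue(fwd, xdpf, dev);
    }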

Hangbin


