Re: [PATCHv9 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support

On Sat, 24 Apr 2021 09:09:25 +0800
Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:

> On Fri, Apr 23, 2021 at 06:54:29PM +0200, Jesper Dangaard Brouer wrote:
> > On Thu, 22 Apr 2021 20:02:18 +0200
> > Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >   
> > > Jesper Dangaard Brouer <brouer@xxxxxxxxxx> writes:
> > >   
> > > > On Thu, 22 Apr 2021 15:14:52 +0800
> > > > Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:
> > > >    
> > > >> diff --git a/net/core/filter.c b/net/core/filter.c
> > > >> index cae56d08a670..afec192c3b21 100644
> > > >> --- a/net/core/filter.c
> > > >> +++ b/net/core/filter.c    
> > > > [...]    
> > > >>  int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > > >>  		    struct bpf_prog *xdp_prog)
> > > >>  {
> > > >> @@ -3933,6 +3950,7 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > > >>  	enum bpf_map_type map_type = ri->map_type;
> > > >>  	void *fwd = ri->tgt_value;
> > > >>  	u32 map_id = ri->map_id;
> > > >> +	struct bpf_map *map;
> > > >>  	int err;
> > > >>  
> > > >>  	ri->map_id = 0; /* Valid map id idr range: [1,INT_MAX[ */
> > > >> @@ -3942,7 +3960,12 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > > >>  	case BPF_MAP_TYPE_DEVMAP:
> > > >>  		fallthrough;
> > > >>  	case BPF_MAP_TYPE_DEVMAP_HASH:
> > > >> -		err = dev_map_enqueue(fwd, xdp, dev);
> > > >> +		map = xchg(&ri->map, NULL);    
> > > >
> > > > Hmm, this looks dangerous for performance on this fast-path.
> > > > The xchg() call can be expensive; AFAIK it is an atomic operation.    
> > > 
> > > Ugh, you're right. That's my bad, I suggested replacing the
> > > READ_ONCE()/WRITE_ONCE() pair with the xchg() because an exchange is
> > > what it's doing, but I failed to consider the performance implications
> > > of the atomic operation. Sorry about that, Hangbin! I guess this should
> > > be changed to:
> > > 
> > > +		map = READ_ONCE(ri->map);
> > > +		if (map) {
> > > +			WRITE_ONCE(ri->map, NULL);
> > > +			err = dev_map_enqueue_multi(xdp, dev, map,
> > > +						    ri->flags & BPF_F_EXCLUDE_INGRESS);
> > > +		} else {
> > > +			err = dev_map_enqueue(fwd, xdp, dev);
> > > +		}  
> > 
> > This is highly sensitive fast-path code; as you saw, Bjørn has been
> > hunting nanoseconds in this area.  The above code implicitly treats
> > "map" as the likely option, which I don't think it is.  
> 
> Hi Jesper,
> 
> From the performance data, there is only a slight impact. Do we still need
> to block the whole patch on this? Or do you have a better solution?

I'm basically just asking you to add an unlikely() annotation:

	map = READ_ONCE(ri->map);
	if (unlikely(map)) {
		WRITE_ONCE(ri->map, NULL);
		err = dev_map_enqueue_multi(xdp, dev, map, [...]
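
For completeness (reusing the dev_map_enqueue_multi() call exactly as it
appears in the patch snippet quoted above), the full branch would then
read something like:

	map = READ_ONCE(ri->map);
	if (unlikely(map)) {
		/* Broadcast redirect is the rare case */
		WRITE_ONCE(ri->map, NULL);
		err = dev_map_enqueue_multi(xdp, dev, map,
					    ri->flags & BPF_F_EXCLUDE_INGRESS);
	} else {
		/* Plain single-device redirect stays the fast path */
		err = dev_map_enqueue(fwd, xdp, dev);
	}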

For XDP, performance is the single most important factor!  You say your
performance data shows only a slight impact, but there must be ZERO
impact when the added feature is not in use.

Your data:
 Version          | Test                                | Generic (pps) | Native (pps)
 5.12 rc4         | redirect_map        i40e->i40e      |          1.9M |         9.6M
 5.12 rc4 + patch | redirect_map        i40e->i40e      |          1.9M |         9.3M

The performance drop from 9.6M to 9.3M pps is a slowdown of 3.36 nanosec
per packet.  Bjørn and others have been working really hard to optimize
this code path and shave off overheads as small as 1.5 nanosec.  Thus,
introducing 3.36 nanosec of added overhead on the fast-path is
significant.
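
The arithmetic, for reference:

	1/9.3M pps - 1/9.6M pps = 107.5 ns - 104.2 ns ~= 3.36 ns extra per packet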

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
