Re: [PATCHv9 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support

Hangbin Liu <liuhangbin@xxxxxxxxx> · Sat, 24 Apr 2021 09:09:25 +0800

On Fri, Apr 23, 2021 at 06:54:29PM +0200, Jesper Dangaard Brouer wrote:
> On Thu, 22 Apr 2021 20:02:18 +0200
> Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> 
> > Jesper Dangaard Brouer <brouer@xxxxxxxxxx> writes:
> > 
> > > On Thu, 22 Apr 2021 15:14:52 +0800
> > > Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:
> > >  
> > >> diff --git a/net/core/filter.c b/net/core/filter.c
> > >> index cae56d08a670..afec192c3b21 100644
> > >> --- a/net/core/filter.c
> > >> +++ b/net/core/filter.c  
> > > [...]  
> > >>  int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > >>  		    struct bpf_prog *xdp_prog)
> > >>  {
> > >> @@ -3933,6 +3950,7 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > >>  	enum bpf_map_type map_type = ri->map_type;
> > >>  	void *fwd = ri->tgt_value;
> > >>  	u32 map_id = ri->map_id;
> > >> +	struct bpf_map *map;
> > >>  	int err;
> > >>  
> > >>  	ri->map_id = 0; /* Valid map id idr range: [1,INT_MAX[ */
> > >> @@ -3942,7 +3960,12 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
> > >>  	case BPF_MAP_TYPE_DEVMAP:
> > >>  		fallthrough;
> > >>  	case BPF_MAP_TYPE_DEVMAP_HASH:
> > >> -		err = dev_map_enqueue(fwd, xdp, dev);
> > >> +		map = xchg(&ri->map, NULL);  
> > >
> > > Hmm, this looks dangerous for performance to have on this fast-path.
> > > The xchg call can be expensive, AFAIK this is an atomic operation.  
> > 
> > Ugh, you're right. That's my bad, I suggested replacing the
> > READ_ONCE()/WRITE_ONCE() pair with the xchg() because an exchange is
> > what it's doing, but I failed to consider the performance implications
> > of the atomic operation. Sorry about that, Hangbin! I guess this should
> > be changed to:
> > 
> > +		map = READ_ONCE(ri->map);
> > +		if (map) {
> > +			WRITE_ONCE(ri->map, NULL);
> > +			err = dev_map_enqueue_multi(xdp, dev, map,
> > +						    ri->flags & BPF_F_EXCLUDE_INGRESS);
> > +		} else {
> > +			err = dev_map_enqueue(fwd, xdp, dev);
> > +		}
> 
> This is highly sensitive fast-path code, as you saw Bjørn have been
> hunting nanosec in this area.  The above code implicitly have "map" as
> the likely option, which I don't think it is.

Hi Jesper,

>From the performance data, there is only a slightly impact. Do we still need
to block the whole patch on this? Or if you have a better solution?

Thanks
Hangbin