Re: [RFC PATCHv2 bpf-next 1/2] xdp: add a new helper for dev map multicast support

Hangbin Liu <liuhangbin@xxxxxxxxx> · Wed, 6 May 2020 17:35:57 +0800

Hi Lorenzo,

Thanks for the comments, please see replies below.

On Fri, Apr 24, 2020 at 04:19:08PM +0200, Lorenzo Bianconi wrote:
> > +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map,
> > +			int exclude_ifindex)
> > +{
> > +	struct bpf_dtab_netdev *in_obj = NULL;
> > +	u32 key, next_key;
> > +	int err;
> > +
> > +	if (!map)
> > +		return false;
> 
> doing so it seems mandatory to define an exclude_map even if we want just to do
> not forward the packet to the "ingress" interface.
> Moreover I was thinking that we can assume to never forward to in the incoming
> interface. Doing so the code would be simpler I guess. Is there a use case for
> it? (forward even to the ingress interface)

Eelco has help answered one use case: VEPA. Another reason I added this flag
is that the other syscalls like bpf_redirect() or bpf_redirect_map() are
also able to forward to ingress interface. So we need to behave the same
by default.
> 
> > +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
> > +			  struct bpf_map *map, struct bpf_map *ex_map,
> > +			  bool exclude_ingress)
> > +{

[...]
> > +	}
> 
> Do we need to free 'incoming' xdp buffer here? I think most of the drivers assume
> the packet is owned by the stack if xdp_do_redirect returns 0

Yes, we need. I will fix it.
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 7d6ceaa54d21..94d1530e5ac6 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3473,12 +3473,17 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
> >  };
> >  
> >  static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
> > -			    struct bpf_map *map, struct xdp_buff *xdp)
> > +			    struct bpf_map *map, struct xdp_buff *xdp,
> > +			    struct bpf_map *ex_map, bool exclude_ingress)
> >  {
> >  	switch (map->map_type) {
> >  	case BPF_MAP_TYPE_DEVMAP:
> >  	case BPF_MAP_TYPE_DEVMAP_HASH:
> > -		return dev_map_enqueue(fwd, xdp, dev_rx);
> > +		if (fwd)
> > +			return dev_map_enqueue(fwd, xdp, dev_rx);
> > +		else
> > +			return dev_map_enqueue_multi(xdp, dev_rx, map, ex_map,
> > +						     exclude_ingress);
> 
> I guess it would be better to do not make it the default case. Maybe you can
> add a bit in flags to mark it for "multicast"

But how do we distinguish the flag bit with other syscalls? e.g. If we define
0x02 as the "do_multicast" flag. What if other syscalls also used this flag.

Currently __bpf_tx_xdp_map() is only called by xdp_do_redirect(). If there
is a map and no fwd, it must be multicast forward. So we are still safe now.
Maybe we need an update in future.

Thanks
Hangbin