Hi Lorenzo, Thanks for the comments, please see replies below. On Fri, Apr 24, 2020 at 04:19:08PM +0200, Lorenzo Bianconi wrote: > > +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map, > > + int exclude_ifindex) > > +{ > > + struct bpf_dtab_netdev *in_obj = NULL; > > + u32 key, next_key; > > + int err; > > + > > + if (!map) > > + return false; > > doing so it seems mandatory to define an exclude_map even if we want just to do > not forward the packet to the "ingress" interface. > Moreover I was thinking that we can assume to never forward to in the incoming > interface. Doing so the code would be simpler I guess. Is there a use case for > it? (forward even to the ingress interface) Eelco has help answered one use case: VEPA. Another reason I added this flag is that the other syscalls like bpf_redirect() or bpf_redirect_map() are also able to forward to ingress interface. So we need to behave the same by default. > > > +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx, > > + struct bpf_map *map, struct bpf_map *ex_map, > > + bool exclude_ingress) > > +{ [...] > > + } > > Do we need to free 'incoming' xdp buffer here? I think most of the drivers assume > the packet is owned by the stack if xdp_do_redirect returns 0 Yes, we need. I will fix it. > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 7d6ceaa54d21..94d1530e5ac6 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -3473,12 +3473,17 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { > > }; > > > > static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, > > - struct bpf_map *map, struct xdp_buff *xdp) > > + struct bpf_map *map, struct xdp_buff *xdp, > > + struct bpf_map *ex_map, bool exclude_ingress) > > { > > switch (map->map_type) { > > case BPF_MAP_TYPE_DEVMAP: > > case BPF_MAP_TYPE_DEVMAP_HASH: > > - return dev_map_enqueue(fwd, xdp, dev_rx); > > + if (fwd) > > + return dev_map_enqueue(fwd, xdp, dev_rx); > > + else > > + return dev_map_enqueue_multi(xdp, dev_rx, map, ex_map, > > + exclude_ingress); > > I guess it would be better to do not make it the default case. Maybe you can > add a bit in flags to mark it for "multicast" But how do we distinguish the flag bit with other syscalls? e.g. If we define 0x02 as the "do_multicast" flag. What if other syscalls also used this flag. Currently __bpf_tx_xdp_map() is only called by xdp_do_redirect(). If there is a map and no fwd, it must be multicast forward. So we are still safe now. Maybe we need an update in future. Thanks Hangbin