On Wed, Mar 31, 2021 at 03:41:17PM +0200, Toke Høiland-Jørgensen wrote: > > @@ -1491,13 +1492,20 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind > > */ > > ri->map_id = INT_MAX; /* Valid map id idr range: [1,INT_MAX[ */ > > ri->map_type = BPF_MAP_TYPE_UNSPEC; > > - return flags; > > + return flags & BPF_F_ACTION_MASK; > > } > > > > ri->tgt_index = ifindex; > > ri->map_id = map->id; > > ri->map_type = map->map_type; > > > > + if ((map->map_type == BPF_MAP_TYPE_DEVMAP || > > + map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) && > > + (flags & BPF_F_BROADCAST)) { > > + ri->flags = flags; > > This, combined with this: > > [...] > > > + if (ri->flags & BPF_F_BROADCAST) { > > + map = READ_ONCE(ri->map); > > + WRITE_ONCE(ri->map, NULL); > > + } > > + > > switch (map_type) { > > case BPF_MAP_TYPE_DEVMAP: > > fallthrough; > > case BPF_MAP_TYPE_DEVMAP_HASH: > > - err = dev_map_enqueue(fwd, xdp, dev); > > + if (ri->flags & BPF_F_BROADCAST) > > + err = dev_map_enqueue_multi(xdp, dev, map, > > + ri->flags & BPF_F_EXCLUDE_INGRESS); > > + else > > + err = dev_map_enqueue(fwd, xdp, dev); > > Means that (since the flags value is never cleared again) once someone > has done a broadcast redirect, that's all they'll ever get until the > next reboot ;) How about just get the ri->flags first and clean it directly. e.g. flags = ri->flags; ri->flags = 0; With this we don't need to add an extra field ri->exclude_ingress as you mentioned later. People may also need the flags field in future. > > Also, the bpf_clear_redirect_map() call has no effect since the call to > dev_map_enqueue_multi() only checks the flags and not the value of the > map pointer before deciding which enqueue function to call. > > To fix both of these, how about changing the logic so that: > > - __bpf_xdp_redirect_map() sets the map pointer if the broadcast flag is > set (and clears it if the flag isn't set!) OK > > - xdp_do_redirect() does the READ_ONCE/WRITE_ONCE on ri->map > unconditionally and then dispatches to dev_map_enqueue_multi() if the > read resulted in a non-NULL pointer > > Also, it should be invalid to set the broadcast flag with a map type > other than a devmap; __bpf_xdp_redirect_map() should check that. The current code only do if (unlikely(flags > XDP_TX)) and didn't check the map type. I also only set the map when there has devmap + broadcast flag. Do you mean we need a more strict check? like if (unlikely((flags & ~(BPF_F_ACTION_MASK | BPF_F_REDIR_MASK)) || (map->map_type != BPF_MAP_TYPE_DEVMAP && map->map_type != BPF_MAP_TYPE_DEVMAP_HASH && flags & BPF_F_REDIR_MASK))) Thanks Hangbin > > And finally, with the changes above, you no longer need the broadcast > flag in do_redirect() at all, so you could just have a bool > ri->exclude_ingress that is set in the helper and pass that directly to > dev_map_enqueue_multi(). > > A selftest that validates that the above works as it's supposed to might > be nice as well (i.e., one that broadcasts and does a regular redirect > after one another) > > -Toke >