Martin KaFai Lau <martin.lau@xxxxxxxxx> writes:

> On 4/18/24 12:18 AM, Toke Høiland-Jørgensen wrote:
>> When redirecting a packet using XDP, the bpf_redirect_map() helper will set
>> up the redirect destination information in struct bpf_redirect_info (using
>> the __bpf_xdp_redirect_map() helper function), and the xdp_do_redirect()
>> function will read this information after the XDP program returns and pass
>> the frame on to the right redirect destination.
>>
>> When using the BPF_F_BROADCAST flag to do multicast redirect to a whole
>> map, __bpf_xdp_redirect_map() sets the 'map' pointer in struct
>> bpf_redirect_info to point to the destination map to be broadcast. And
>> xdp_do_redirect() reacts to the value of this map pointer to decide whether
>> it's dealing with a broadcast or a single-value redirect. However, if the
>> destination map is being destroyed before xdp_do_redirect() is called, the
>> map pointer will be cleared out (by bpf_clear_redirect_map()) without
>> waiting for any XDP programs to stop running. This causes xdp_do_redirect()
>> to think that the redirect was to a single target, but the target pointer
>> is also NULL (since broadcast redirects don't have a single target), so
>> this causes a crash when a NULL pointer is passed to dev_map_enqueue().
>>
>> To fix this, change xdp_do_redirect() to react directly to the presence of
>> the BPF_F_BROADCAST flag in the 'flags' value in struct bpf_redirect_info
>> to disambiguate between a single-target and a broadcast redirect. And only
>> read the 'map' pointer if the broadcast flag is set, aborting if that has
>> been cleared out in the meantime. This prevents the crash, while keeping
>> the atomic (cmpxchg-based) clearing of the map pointer itself, and without
>> adding any more checks in the non-broadcast fast path.
>>
>> Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
>> Reported-and-tested-by: syzbot+af9492708df9797198d6@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
>> ---
>>  net/core/filter.c | 42 ++++++++++++++++++++++++++++++++----------
>>  1 file changed, 32 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 786d792ac816..8120c3dddf5e 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -4363,10 +4363,12 @@ static __always_inline int __xdp_do_redirect_frame(struct bpf_redirect_info *ri,
>>  	enum bpf_map_type map_type = ri->map_type;
>>  	void *fwd = ri->tgt_value;
>>  	u32 map_id = ri->map_id;
>> +	u32 flags = ri->flags;
>>  	struct bpf_map *map;
>>  	int err;
>>
>>  	ri->map_id = 0; /* Valid map id idr range: [1,INT_MAX[ */
>> +	ri->flags = 0;
>>  	ri->map_type = BPF_MAP_TYPE_UNSPEC;
>>
>>  	if (unlikely(!xdpf)) {
>> @@ -4378,11 +4380,20 @@ static __always_inline int __xdp_do_redirect_frame(struct bpf_redirect_info *ri,
>>  	case BPF_MAP_TYPE_DEVMAP:
>>  		fallthrough;
>>  	case BPF_MAP_TYPE_DEVMAP_HASH:
>> -		map = READ_ONCE(ri->map);
>> -		if (unlikely(map)) {
>> +		if (unlikely(flags & BPF_F_BROADCAST)) {
>> +			map = READ_ONCE(ri->map);
>> +
>> +			/* The map pointer is cleared when the map is being torn
>> +			 * down by bpf_clear_redirect_map()
>
> Thanks for the details explanation in the commit message. All make sense.

Great!

> It could be a dumb question.
>
> From reading the "waits for...NAPI being the relevant context here..." comment
> in dev_map_free(), I wonder if moving synchronize_rcu() before
> bpf_clear_redirect_map() would also work? Actually, does it need to call
> bpf_clear_redirect_map(). The on-going xdp_do_redirect() should be the last one
> using the map in ri->map anyway and no xdp prog can set it again to
> ri->map.

I think we do need to retain the current behaviour, because of the
decoupling between the helper and the return code.
Otherwise, you could have a program that calls the bpf_redirect_map()
helper, but returns a different value (say, XDP_DROP). In this case, the
map pointer will stick around in struct bpf_redirect_info, and if a
subsequent XDP program then returns XDP_REDIRECT (*without* calling
bpf_redirect_map()), it will use the stale pointer value and cause a UAF.

-Toke
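
To make the stale-pointer scenario above concrete, here is a rough
userspace toy model of it. This is only a sketch and not kernel code:
every toy_* name is hypothetical, the 'freed' flag stands in for the map
memory actually being released, and the 'clear' argument loosely mimics
what bpf_clear_redirect_map() does at teardown. It assumes a single
redirect-info slot rather than the real per-CPU state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
enum toy_result { TOY_OK, TOY_ABORTED, TOY_STALE };

/* Stand-in for a map; 'freed' marks memory that would have been released. */
struct toy_map {
	bool freed;
};

/* Stand-in for struct bpf_redirect_info, reduced to the map pointer. */
struct toy_redirect_info {
	struct toy_map *map;
};

/* Models the helper call: only records the destination map. */
static void toy_redirect_map(struct toy_redirect_info *ri, struct toy_map *m)
{
	ri->map = m;
}

/* Models map teardown. With 'clear' set, the pointer left behind in the
 * redirect info is wiped first, loosely like bpf_clear_redirect_map().
 */
static void toy_map_free(struct toy_redirect_info *ri, struct toy_map *m,
			 bool clear)
{
	if (clear && ri->map == m)
		ri->map = NULL;
	m->freed = true;
}

/* Models the redirect step that runs only when a program returns a
 * redirect verdict: it consumes whatever pointer is stored.
 */
static enum toy_result toy_do_redirect(struct toy_redirect_info *ri)
{
	struct toy_map *m = ri->map;

	ri->map = NULL;
	if (!m)
		return TOY_ABORTED;	/* pointer was cleared: clean abort */
	if (m->freed)
		return TOY_STALE;	/* would be a use-after-free */
	return TOY_OK;
}

/* Program A calls the helper but returns a drop verdict, so nothing
 * consumes the pointer. The map is then torn down, and program B returns
 * a redirect verdict without calling the helper.
 */
enum toy_result toy_scenario(bool clear)
{
	struct toy_map m = { .freed = false };
	struct toy_redirect_info ri = { .map = NULL };

	toy_redirect_map(&ri, &m);	/* program A: helper, then drop */
	toy_map_free(&ri, &m, clear);	/* map teardown */
	return toy_do_redirect(&ri);	/* program B: redirect, no helper */
}
```

In this model, toy_scenario(true) ends in a clean abort, while
toy_scenario(false) reaches the stale-pointer branch, which is the case
that clearing the pointer at teardown is meant to rule out.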