On Thu, Apr 21, 2022 at 3:09 PM Peilin Ye <yepeilin.cs@xxxxxxxxx> wrote: > > From: Peilin Ye <peilin.ye@xxxxxxxxxxxxx> > > As pointed out by Jakub Kicinski, currently using TUNNEL_SEQ in > collect_md mode is racy for [IP6]GRE[TAP] devices. Consider the > following sequence of events: > > 1. An [IP6]GRE[TAP] device is created in collect_md mode using "ip link > add ... external". "ip" ignores "[o]seq" if "external" is specified, > so TUNNEL_SEQ is off, and the device is marked as NETIF_F_LLTX (i.e. > it uses lockless TX); > 2. Someone sets TUNNEL_SEQ on outgoing skb's, using e.g. > bpf_skb_set_tunnel_key() in an eBPF program attached to this device; > 3. gre_fb_xmit() or __gre6_xmit() processes these skb's: > > gre_build_header(skb, tun_hlen, > flags, protocol, > tunnel_id_to_key32(tun_info->key.tun_id), > (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) > : 0); ^^^^^^^^^^^^^^^^^ > > Since we are not using the TX lock (&txq->_xmit_lock), multiple CPUs may > try to do this tunnel->o_seqno++ in parallel, which is racy. Fix it by > making o_seqno atomic_t. > > As mentioned by Eric Dumazet in commit b790e01aee74 ("ip_gre: lockless > xmit"), making o_seqno atomic_t increases "chance for packets being out > of order at receiver" when NETIF_F_LLTX is on. > > Maybe a better fix would be: > > 1. Do not ignore "oseq" in external mode. Users MUST specify "oseq" if > they want the kernel to allow sequencing of outgoing packets; > 2. Reject all outgoing TUNNEL_SEQ packets if the device was not created > with "oseq". > > Unfortunately, that would break userspace. > > We could now make [IP6]GRE[TAP] devices always NETIF_F_LLTX, but let us > do it in separate patches to keep this fix minimal. > > Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx> > Fixes: 77a5196a804e ("gre: add sequence number for collect md mode.") > Signed-off-by: Peilin Ye <peilin.ye@xxxxxxxxxxxxx> > --- LGTM Acked-by: William Tu <u9012063@xxxxxxxxx>