On Mon, Feb 28, 2022 at 3:41 PM Florian Westphal <fw@xxxxxxxxx> wrote: > > Joe Stringer <joe@xxxxxxxxx> wrote: > > On Mon, Feb 28, 2022 at 8:29 AM Florian Westphal <fw@xxxxxxxxx> wrote: > > > + if (skb_sk_is_prefetched(skb)) { > > > + struct sock *sk = skb->sk; > > > + > > > + if (!sk_is_refcounted(sk)) { > > > + if (!refcount_inc_not_zero(&sk->sk_refcnt)) > > > + return -ENOTCONN; > > > + > > > + /* drop refcount on skb_orphan */ > > > + skb->destructor = sock_edemux; > > > + } > > > + } > > > + > > > entry = kmalloc(sizeof(*entry) + route_key_size, GFP_ATOMIC); > > > if (!entry) > > > return -ENOMEM; > > > > I've never heard of someone trying to use socket prefetch / > > bpf_sk_assign in conjunction with nf_queue, bit of an unusual case. > > Me neither, but if someone does it, skb->sk leaves rcu protection. > > > Given that `skb_sk_is_prefetched()` relies on the skb->destructor > > pointing towards sock_pfree, and this code would change that to > > sock_edemux, the difference the patch would make is this: if the > > packet is queued and then accepted, the socket prefetch selection > > could be ignored. > > Hmmm, wait a second, is that because of orphan in input path, i.e., > that this preselect has to work even across veth/netns crossing? Yes it's linked to the orphan in input path which otherwise breaks the feature inside the same netns. I'm not sure I follow the question about veth/netns crossing as this feature can only be used on TC ingress today. It just prefetches the socket as the packet is received up the stack from a device.