Hi,

On Fri, 2019-06-21 at 18:41 +0200, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> > > AFAICS so far this would be enough:
> > >
> > > 1. remove the BUG_ON() in skb_orphan, letting it clear skb->sk instead
> > > 2. in nf_queue_entry_get_refs(), if skb->sk and no destructor:
> > >    call nf_tproxy_assign_sock() so a reference gets taken.
> > > 3. change skb_steal_sock:
> > >    static inline struct sock *skb_steal_sock(struct sk_buff *skb, bool *refcounted)
> > >    [..]
> > >    *refcounted = skb->destructor != NULL;
> > > 4. make tproxy sk assign elide the destructor assigment in case of
> > >    a listening sk.
> > >
> >
> > Okay, but how do we make sure the skb->sk association does not leak from rcu section ?
>
> From netfilter pov the only escape point is nfqueue (and kfree_skb),
> so for tcp/udp it will end up in their respective rx path eventually.
> But you are right in that we need to also audit all NF_STOLEN users that
> can be invoked from PRE_ROUTING and INPUT hooks.
>
> OUTPUT/FORWARD/POSTROUTING are not relevant, in case skb enters IP forwarding,
> it will be dropped there (we have a check to toss skb with socket
> attached in forward).
>
> In recent hallway discussion Eric suggested to add a empty destructor
> stub, it would allow to do the needed annotation, i.e.
> no need to change skb_orphan(), *refcounted would be set via
> skb->destructor != noref_listen_skb_destructor check.

Perhaps I'm misreading the above, but it looks like this has some
overlapping with a past attempt:

https://marc.info/?l=linux-netdev&m=150611442802964&w=2

Cheers,

Paolo
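
For reference, the empty-destructor-stub idea quoted above could look
roughly like the following. This is only a userspace sketch, not kernel
code: struct sock and struct sk_buff are cut-down hypothetical stand-ins,
sock_put_destructor() is an invented placeholder for a "normal" destructor
that drops a reference, and the body of skb_steal_sock() is an assumption
built around the single comparison mentioned in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
struct sock {
	int refcnt;
};

struct sk_buff {
	struct sock *sk;
	void (*destructor)(struct sk_buff *skb);
};

/* Placeholder for a regular destructor that releases the skb's reference. */
static inline void sock_put_destructor(struct sk_buff *skb)
{
	skb->sk->refcnt--;
}

/*
 * Empty stub, as suggested in the thread: it does nothing when called and
 * exists only so that skb->destructor can mark "sk attached, no reference
 * held" (for example a listening socket looked up under RCU).
 */
static inline void noref_listen_skb_destructor(struct sk_buff *skb)
{
	(void)skb;
}

/*
 * Detach the socket from the skb and tell the caller whether it now owns
 * a reference that must be dropped later.
 */
static inline struct sock *skb_steal_sock(struct sk_buff *skb, bool *refcounted)
{
	struct sock *sk = skb->sk;

	if (!sk) {
		*refcounted = false;
		return NULL;
	}

	/* The annotation check: the stub means no reference was taken. */
	*refcounted = skb->destructor != noref_listen_skb_destructor;
	skb->destructor = NULL;
	skb->sk = NULL;
	return sk;
}
```

With this shape skb_orphan() would not need to change, since every skb
with skb->sk set still carries a destructor; only callers of
skb_steal_sock() have to look at *refcounted before dropping a reference.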