On 6/21/19 5:51 AM, Florian Westphal wrote:
> Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> wrote:
>>> So, at least for this part I don't see a technical reason why this
>>> has to grab a reference for listener socket.
>>
>> That's helpful, thanks! We rely on TPROXY, so I would like to help with
>> that. Let me see if I can get time to work on it.
>
> AFAICS so far this would be enough:
>
> 1. remove the BUG_ON() in skb_orphan, letting it clear skb->sk instead
> 2. in nf_queue_entry_get_refs(), if skb->sk and no destructor:
>    call nf_tproxy_assign_sock() so a reference gets taken.
> 3. change skb_steal_sock:
>    static inline struct sock *skb_steal_sock(struct sk_buff *skb, bool *refcounted)
>      [..]
>      *refcounted = skb->destructor != NULL;
> 4. make tproxy sk assign elide the destructor assignment in case of
>    a listening sk.
>

Okay, but how do we make sure the skb->sk association does not leak from
the RCU section?

Note we have the noref/refcounted magic for skb_dst(), we might try to use
something similar for skb->sk.

> This should work because TPROXY target is restricted to PRE_ROUTING, and
> __netif_receive_skb_core runs with rcu readlock already held.
>
> On a side note, it would also be interesting to see what breaks if the
> nf_tproxy_sk_is_transparent() check in the tproxy eval function is
> removed -- if we need the transparent:1 marker only for output, I think
> it would be ok to raise the bit transparently in the kernel in case
> we assign skb->sk = found_sk; i.e.
>   if (unlikely(!sk_is_transparent(sk)))
>       make_sk_transparent(sk);
>
> I don't see a reason why we need the explicit setsockopt(IP_TRANSPARENT)
> from userspace.
>
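
To make the noref/refcounted suggestion a bit more concrete: skb_dst() keeps
a flag in the low bit of skb->_skb_refdst (SKB_DST_NOREF) to say whether the
skb holds a reference, and skb_dst_force() upgrades a noref dst before the skb
leaves the RCU section. Below is a rough userspace model of what the same
scheme might look like for skb->sk. This is only a sketch: every name in it
(sock_model, skb_model, SK_NOREF, skb_sk_force(), skb_steal_sk(), ...) is
hypothetical and not existing kernel API, and the plain int counter stands in
for the real refcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: none of these names exist in the kernel. */
#define SK_NOREF   ((uintptr_t)1)   /* low bit set: no reference is held */
#define SK_PTRMASK (~SK_NOREF)

struct sock_model {
	int refcnt;                 /* stands in for the socket refcount */
};

struct skb_model {
	uintptr_t sk_ref;           /* tagged pointer, like skb->_skb_refdst */
};

/* Attach sk without taking a reference; only valid inside an RCU section. */
static void skb_set_sk_noref(struct skb_model *skb, struct sock_model *sk)
{
	skb->sk_ref = (uintptr_t)sk | SK_NOREF;
}

/* Attach sk and take a reference; the skb may outlive the RCU section. */
static void skb_set_sk_refcounted(struct skb_model *skb, struct sock_model *sk)
{
	sk->refcnt++;
	skb->sk_ref = (uintptr_t)sk;
}

/* Return the socket pointer with the tag bit stripped. */
static struct sock_model *skb_sk(const struct skb_model *skb)
{
	return (struct sock_model *)(skb->sk_ref & SK_PTRMASK);
}

/*
 * Upgrade a noref association to a refcounted one. This would have to run
 * before the skb leaves the RCU section (e.g. when it gets queued), in the
 * spirit of skb_dst_force().
 */
static void skb_sk_force(struct skb_model *skb)
{
	struct sock_model *sk = skb_sk(skb);

	if (sk && (skb->sk_ref & SK_NOREF)) {
		sk->refcnt++;
		skb->sk_ref &= SK_PTRMASK;
	}
}

/* Hand the socket to the caller and report whether a reference came along. */
static struct sock_model *skb_steal_sk(struct skb_model *skb, bool *refcounted)
{
	struct sock_model *sk = skb_sk(skb);

	*refcounted = sk != NULL && !(skb->sk_ref & SK_NOREF);
	skb->sk_ref = 0;
	return sk;
}
```

In this model the "leak" question becomes: every path that lets the skb escape
the RCU read-side section has to call something like skb_sk_force() first, and
a caller of skb_steal_sk() only drops a reference when *refcounted is true.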