On Sat, May 18, 2019 at 7:08 PM Martin Lau <kafai@xxxxxx> wrote:
>
> On Sat, May 18, 2019 at 06:52:48PM -0700, Joe Stringer wrote:
> > On Sat, May 18, 2019, 09:05 Martin Lau <kafai@xxxxxx> wrote:
> > >
> > > On Sat, May 18, 2019 at 08:38:46AM -1000, Joe Stringer wrote:
> > > > On Fri, May 17, 2019, 12:02 Martin Lau <kafai@xxxxxx> wrote:
> > > >
> > > > > On Fri, May 17, 2019 at 02:51:48PM -0700, Eric Dumazet wrote:
> > > > > >
> > > > > >
> > > > > > On 5/17/19 2:21 PM, Martin KaFai Lau wrote:
> > > > > > > The BPF_FUNC_sk_lookup_xxx helpers return RET_PTR_TO_SOCKET_OR_NULL.
> > > > > > > Meaning a fullsock ptr and its fullsock's fields in bpf_sock can be
> > > > > > > accessed, e.g. type, protocol, mark and priority.
> > > > > > > Some new helper, like bpf_sk_storage_get(), also expects
> > > > > > > ARG_PTR_TO_SOCKET is a fullsock.
> > > > > > >
> > > > > > > bpf_sk_lookup() currently calls sk_to_full_sk() before returning.
> > > > > > > However, the ptr returned from sk_to_full_sk() is not guaranteed
> > > > > > > to be a fullsock. For example, it cannot get a fullsock if sk
> > > > > > > is in TCP_TIME_WAIT.
> > > > > > >
> > > > > > > This patch checks for sk_fullsock() before returning. If it is not
> > > > > > > a fullsock, sock_gen_put() is called if needed and then returns NULL.
> > > > > > >
> > > > > > > Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF")
> > > > > > > Cc: Joe Stringer <joe@xxxxxxxxxxxxx>
> > > > > > > Signed-off-by: Martin KaFai Lau <kafai@xxxxxx>
> > > > > > > ---
> > > > > > >  net/core/filter.c | 16 ++++++++++++++--
> > > > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > > > > index 55bfc941d17a..85def5a20aaf 100644
> > > > > > > --- a/net/core/filter.c
> > > > > > > +++ b/net/core/filter.c
> > > > > > > @@ -5337,8 +5337,14 @@ __bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
> > > > > > >       struct sock *sk = __bpf_skc_lookup(skb, tuple, len, caller_net,
> > > > > > >                                          ifindex, proto, netns_id, flags);
> > > > > > >
> > > > > > > -     if (sk)
> > > > > > > +     if (sk) {
> > > > > > >               sk = sk_to_full_sk(sk);
> > > > > > > +             if (!sk_fullsock(sk)) {
> > > > > > > +                     if (!sock_flag(sk, SOCK_RCU_FREE))
> > > > > > > +                             sock_gen_put(sk);
> > > > > >
> > > > > > This looks a bit convoluted/weird.
> > > > > >
> > > > > > What about telling/asking __bpf_skc_lookup() to not return a non
> > > > > > fullsock instead?
> > > > > It is because some other helpers, like BPF_FUNC_skc_lookup_tcp,
> > > > > can return non fullsock
> > > > >
> > > >
> > > > FYI this is necessary for finding a transparently proxied socket for a
> > > > non-local connection (tproxy use case).
> > > You meant it is necessary to return a non fullsock from the
> > > BPF_FUNC_sk_lookup_xxx helpers?
> >
> > Yes, that's what I want to associate with the skb so that the delivery
> > to the SO_TRANSPARENT is received properly.
> >
> > For the first packet of a connection, we look up the socket using the
> > tproxy socket port as the destination, and deliver the packet there.
> > The SO_TRANSPARENT logic then kicks in and sends back the ack and
> > creates the non-full sock for the connection tuple, which can be
> > entirely unrelated to local addresses or ports.
> >
> > For the second forward-direction packet (i.e. the ACK in the 3-way
> > handshake), we must deliver the packet to this non-full sock as that's
> > what is negotiating the proxied connection. If you look up using the packet
> > tuple then get the full sock from it, it will go back to the
> > SO_TRANSPARENT parent socket. Delivering the ACK there will result in
> > a RST being sent back, because the SO_TRANSPARENT socket is just there
> > to accept new connections for connections to be proxied. So this is
> > the case where I need the non-full sock.
> >
> > (In practice, the lookup logic attempts the packet tuple first then if
> > that fails, uses the tproxy port for lookup to achieve the above).
> hmm...I am likely missing something.
>
> 1) The above can be done by the "BPF_FUNC_skC_lookup_tcp" which
> returns a non fullsock (RET_PTR_TO_SOCK_COMMON_OR_NULL), no?

Correct. I meant that as a response to Eric, describing the use cases
for __bpf_skc_lookup() returning a non fullsock.

> 2) The bpf_func_proto of "BPF_FUNC_sk_lookup_tcp" returns
> fullsock (RET_PTR_TO_SOCKET_OR_NULL) and the bpf_prog (and
> the verifier) is expecting that. How to address the bug here?

Your proposal seems fine to me.
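
For anyone following along, the check discussed above can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions,
not the kernel code: every mock_* name is a hypothetical stand-in invented
for illustration, the three-state enum is a simplification of the TCP state
machine, and the "return NULL" step is taken from the commit message since
the quoted hunk is cut off before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum mock_state { MOCK_ESTABLISHED, MOCK_NEW_SYN_RECV, MOCK_TIME_WAIT };

struct mock_sock {
	enum mock_state state;
	struct mock_sock *listener; /* parent of a request sock, else NULL */
	bool rcu_free;              /* models SOCK_RCU_FREE: no reference held */
	int refcnt;
};

/* Models sk_fullsock(): request and timewait socks are not full sockets. */
static bool mock_sk_fullsock(const struct mock_sock *sk)
{
	return sk->state != MOCK_NEW_SYN_RECV && sk->state != MOCK_TIME_WAIT;
}

/*
 * Models sk_to_full_sk(): a request sock maps to its listener, anything
 * else is returned unchanged, so a timewait sock stays a non-fullsock.
 */
static struct mock_sock *mock_sk_to_full_sk(struct mock_sock *sk)
{
	if (sk->state == MOCK_NEW_SYN_RECV && sk->listener)
		return sk->listener;
	return sk;
}

/* Models sock_gen_put(): drop the reference taken by the lookup. */
static void mock_sock_gen_put(struct mock_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/*
 * Sketch of the proposed behaviour: hand back only full sockets. A
 * non-fullsock is released (unless it is RCU-freed) and NULL is returned.
 */
static struct mock_sock *mock_sk_lookup_full(struct mock_sock *found)
{
	struct mock_sock *sk = found;

	if (sk) {
		sk = mock_sk_to_full_sk(sk);
		if (!mock_sk_fullsock(sk)) {
			if (!sk->rcu_free)
				mock_sock_gen_put(sk);
			return NULL;
		}
	}
	return sk;
}
```

In this model a timewait sock passed to mock_sk_lookup_full() comes back as
NULL with its reference dropped, while an established sock is returned
untouched. The skc variant would simply skip the fullsock filter.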