On Tue, Mar 19, 2019 at 10:20:59AM +0000, Lorenz Bauer wrote:
> Using bpf_skc_lookup_tcp it's possible to ascertain whether a packet
> belongs to a known connection. However, there is one corner case: no
> sockets are created if SYN cookies are active. This means that the final
> ACK in the 3WHS is misclassified.
>
> Using the helper, we can look up the listening socket via
> bpf_skc_lookup_tcp and then check whether a packet is a valid SYN
> cookie ACK.
>
> Signed-off-by: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
> ---
>  include/uapi/linux/bpf.h | 18 +++++++++-
>  net/core/filter.c        | 72 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 89 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 8e4f8276942a..587d7a3295bf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2391,6 +2391,21 @@ union bpf_attr {
>   *		Pointer to **struct bpf_sock**, or **NULL** in case of failure.
>   *		For sockets with reuseport option, the **struct bpf_sock**
>   *		result is from **reuse->socks**\ [] using the hash of the tuple.
> + *
> + * int bpf_tcp_check_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
> + * 	Description
> + * 		Check whether iph and th contain a valid SYN cookie ACK for
> + * 		the listening socket in sk.
> + *
> + * 		iph points to the start of the IPv4 or IPv6 header, while
> + * 		iph_len contains sizeof(struct iphdr) or sizeof(struct ip6hdr).
> + *
> + * 		th points to the start of the TCP header, while th_len contains
> + * 		sizeof(struct tcphdr).
> + *
> + * 	Return
> + * 		0 if iph and th are a valid SYN cookie ACK, or a negative error
> + * 		otherwise.
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> @@ -2492,7 +2507,8 @@ union bpf_attr {
>  	FN(tcp_sock),			\
>  	FN(skb_ecn_set_ce),		\
>  	FN(get_listener_sock),		\
> -	FN(skc_lookup_tcp),
> +	FN(skc_lookup_tcp),		\
> +	FN(tcp_check_syncookie),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> diff --git a/net/core/filter.c b/net/core/filter.c
> index f5210773cfd8..45a46fbe4ee0 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5546,6 +5546,74 @@ static const struct bpf_func_proto bpf_skb_ecn_set_ce_proto = {
>  	.ret_type       = RET_INTEGER,
>  	.arg1_type      = ARG_PTR_TO_CTX,
>  };
> +
> +BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len,
> +	   struct tcphdr *, th, u32, th_len)

What was the reason to pass the IP and TCP header pointers separately?
Why not pass 'void *iph' with a len long enough to cover both?
The helper can compute the TCP header itself and take the variable-length
IPv4 header into account, while checking that the resulting pointer is
within the 'iph + len' range validated by the verifier.
The example in the last patch is buggy in that sense, since it's doing:
  tcph = data + sizeof(struct ethhdr) + sizeof(struct iphdr);

If we drop two args then there will be room for 'flags'.
It could be used, for example, to indicate whether
LINUX_MIB_SYNCOOKIESFAILED should be incremented or not.
Or maybe it should be incremented unconditionally?

> +{
> +#ifdef CONFIG_SYN_COOKIES
> +	u32 cookie;
> +	int ret;
> +
> +	if (unlikely(th_len < sizeof(*th)))
> +		return -EINVAL;
> +
> +	/* sk_listener() allows TCP_NEW_SYN_RECV, which makes no sense here. */
> +	if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
> +		return -EINVAL;
> +
> +	if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
> +		return -EINVAL;
> +
> +	if (!th->ack || th->rst || th->syn)
> +		return -ENOENT;
> +
> +	if (tcp_synq_no_recent_overflow(sk))
> +		return -ENOENT;
> +
> +	cookie = ntohl(th->ack_seq) - 1;
> +
> +	switch (sk->sk_family) {
> +	case AF_INET:
> +		if (unlikely(iph_len < sizeof(struct iphdr)))
> +			return -EINVAL;
> +
> +		ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
> +		break;
> +
> +#if IS_BUILTIN(CONFIG_IPV6)
> +	case AF_INET6:
> +		if (unlikely(iph_len < sizeof(struct ipv6hdr)))
> +			return -EINVAL;
> +
> +		ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);

I guess this is the fast path and you don't want an indirect call via
ipv6_bpf_stub?
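
To make the single-pointer suggestion above concrete, here is a minimal
user-space sketch of the offset computation. It is not kernel code and not
part of the patch: the function name ipv4_tcp_offset() and the two length
constants are hypothetical, and it assumes the caller has already validated
that 'len' bytes starting at 'buf' are readable (the role the verifier plays
for 'iph + len'). It reads the IHL field instead of assuming
sizeof(struct iphdr), which is what the quoted example gets wrong.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN 20u	/* IPv4 header without options */
#define TCP_MIN_HDR_LEN  20u	/* TCP header without options */
#define IPV4_PROTO_TCP    6u	/* value of the IPv4 protocol field for TCP */

/*
 * Hypothetical helper: 'buf' points at the start of an IPv4 header and
 * 'len' is the number of readable bytes. Returns the offset of the TCP
 * header within 'buf', or -1 if the packet is not IPv4/TCP or the fixed
 * part of the TCP header does not fit inside buf + len.
 */
long ipv4_tcp_offset(const uint8_t *buf, size_t len)
{
	size_t ihl;

	if (len < IPV4_MIN_HDR_LEN)
		return -1;

	/* Byte 0: version in the high nibble, IHL (32-bit words) in the low. */
	if ((buf[0] >> 4) != 4)
		return -1;

	/* Byte 9 is the protocol field. */
	if (buf[9] != IPV4_PROTO_TCP)
		return -1;

	ihl = (size_t)(buf[0] & 0x0f) * 4;
	if (ihl < IPV4_MIN_HDR_LEN)
		return -1;

	/* The TCP header must still lie within the validated range. */
	if (ihl + TCP_MIN_HDR_LEN > len)
		return -1;

	return (long)ihl;
}
```

With one pointer and one length, a header carrying IP options (IHL > 5) is
handled inside the helper, and two argument slots are freed up, e.g. for
'flags'.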