From: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Date: Tue, 17 Oct 2023 17:54:53 -0700
> On 10/13/23 3:04 PM, Kuniyuki Iwashima wrote:
> > This patch adds a new SOCK_OPS hook to generate arbitrary SYN Cookie.
> >
> > When the kernel sends SYN Cookie to a client, the hook is invoked with
> > bpf_sock_ops.op == BPF_SOCK_OPS_GEN_SYNCOOKIE_CB if the listener has
> > BPF_SOCK_OPS_SYNCOOKIE_CB_FLAG set by bpf_sock_ops_cb_flags_set().
> >
> > The BPF program can access the following information to encode into
> > ISN:
> >
> >   bpf_sock_ops.sk      : 4-tuple
> >   bpf_sock_ops.skb     : TCP header
> >   bpf_sock_ops.args[0] : MSS
> >
> > The program must encode MSS and set it to bpf_sock_ops.replylong[0],
> > which will be looped back to the paired hook added in the following
> > patch.
> >
> > Note that we do not call tcp_synq_overflow() so that the BPF program
> > can set its own expiration period.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> > ---
> >  include/uapi/linux/bpf.h       | 18 +++++++++++++++-
> >  net/ipv4/tcp_input.c           | 38 +++++++++++++++++++++++++++++++++-
> >  tools/include/uapi/linux/bpf.h | 18 +++++++++++++++-
> >  3 files changed, 71 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 7ba61b75bc0e..d3cc530613c0 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -6738,8 +6738,17 @@ enum {
> >  	 * options first before the BPF program does.
> >  	 */
> >  	BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG = (1<<6),
> > +	/* Call bpf when the kernel generates SYN Cookie (ISN) for SYN+ACK.
> > +	 *
> > +	 * The bpf prog will be called to encode MSS into SYN Cookie with
> > +	 * sock_ops->op == BPF_SOCK_OPS_GEN_SYNCOOKIE_CB.
> > +	 *
> > +	 * Please refer to the comment in BPF_SOCK_OPS_GEN_SYNCOOKIE_CB for
> > +	 * input and output.
> > +	 */
> > +	BPF_SOCK_OPS_SYNCOOKIE_CB_FLAG = (1<<7),
> >  /* Mask of all currently supported cb flags */
> > -	BPF_SOCK_OPS_ALL_CB_FLAGS       = 0x7F,
> > +	BPF_SOCK_OPS_ALL_CB_FLAGS       = 0xFF,
> >  };
> >
> >  /* List of known BPF sock_ops operators.
> > @@ -6852,6 +6861,13 @@ enum {
> >  					 * by the kernel or the
> >  					 * earlier bpf-progs.
> >  					 */
> > +	BPF_SOCK_OPS_GEN_SYNCOOKIE_CB,	/* Generate SYN Cookie (ISN of
> > +					 * SYN+ACK).
> > +					 *
> > +					 * args[0]: MSS
> > +					 *
> > +					 * replylong[0]: ISN
> > +					 */
> >  };
> >
> >  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 584825ddd0a0..c86a737e4fe6 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -6966,6 +6966,37 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops,
> >  }
> >  EXPORT_SYMBOL_GPL(tcp_get_syncookie_mss);
> >
> > +#if IS_ENABLED(CONFIG_CGROUP_BPF) && IS_ENABLED(CONFIG_SYN_COOKIES)
> > +static int bpf_skops_cookie_init_sequence(struct sock *sk, struct request_sock *req,
> > +					  struct sk_buff *skb, __u32 *isn)
> > +{
> > +	struct bpf_sock_ops_kern sock_ops;
> > +	int ret;
> > +
> > +	memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
> > +
> > +	sock_ops.op = BPF_SOCK_OPS_GEN_SYNCOOKIE_CB;
> > +	sock_ops.sk = req_to_sk(req);
> > +	sock_ops.args[0] = req->mss;
> > +
> > +	bpf_skops_init_skb(&sock_ops, skb, tcp_hdrlen(skb));
> > +
> > +	ret = BPF_CGROUP_RUN_PROG_SOCK_OPS_SK(&sock_ops, sk);
> > +	if (ret)
> > +		return ret;
> > +
> > +	*isn = sock_ops.replylong[0];
>
> sock_ops.{replylong,reply} cannot be used. afaik, no existing sockops hook
> relies on {replylong,reply}. It is a union of args[4]. There could be a few
> skops bpf in the same cgrp and each of them will be run one after another. (eg.
> two skops progs want to generate cookie).

Ah, I missed that case.

Looking at bpf_prog_run_array_cg(), multiple SOCK_OPS progs can be attached
and args[] is reused across them.
So we cannot use replylong[] as the interface from the bpf prog.

>
> I don't prefer to extend the uapi 'struct bpf_sock_ops' and then the
> sock_ops_convert_ctx_access(). Adding member to the kernel 'struct
> bpf_sock_addr_kern' could still be considered if it is really needed.
>
> One option is to add kfunc to allow the bpf prog to directly update the value of
> the kernel obj (e.g. tcp_rsk(req)->snt_isn here).

Yes, we need to set snt_isn, mss, sack_ok etc based on _CB (if we continue
with SOCK_OPS).

>
> Also, we need to allow a bpf prog to selectively generate custom cookie for one
> SYN but fall-through to the kernel cookie for another SYN.

Initially I implemented the fallback, but the validation hook looked a bit
ugly (because of the reqsk allocation), so I removed the fallback flow.

Also, I thought it could be done with other hooks so that such SYNs would be
distributed to another listener.
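
For reference, the args[]/replylong[] aliasing issue raised above can be
reproduced in a small userspace sketch. This is only an illustration under
stated assumptions, not kernel code: struct fake_sock_ops, prog_a(),
prog_b(), run_progs() and the cookie encoding are all hypothetical names
made up for the example. Only the shape of the union (args[4], reply,
replylong[4] sharing storage) mirrors struct bpf_sock_ops.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the union in struct bpf_sock_ops.
 * All other members are omitted.
 */
struct fake_sock_ops {
	union {
		uint32_t args[4];
		uint32_t reply;
		uint32_t replylong[4];
	};
};

typedef int (*fake_prog_t)(struct fake_sock_ops *ctx);

/* What the second prog found in args[0] when it ran. */
static uint32_t mss_seen_by_b;

/* Hypothetical prog A: reads the MSS and returns a cookie via replylong[0].
 * The encoding is invented for the example.
 */
static int prog_a(struct fake_sock_ops *ctx)
{
	uint32_t mss = ctx->args[0];

	ctx->replylong[0] = 0xC0000000u | mss;
	return 0;
}

/* Hypothetical prog B: also wants the MSS, but runs after prog A. */
static int prog_b(struct fake_sock_ops *ctx)
{
	mss_seen_by_b = ctx->args[0];
	return 0;
}

/* Run both progs one after another on the same context, loosely
 * imitating several sockops progs attached to one cgroup.
 * Returns the value prog B observed in args[0].
 */
static uint32_t run_progs(uint32_t mss)
{
	fake_prog_t progs[] = { prog_a, prog_b };
	struct fake_sock_ops ctx;
	size_t i;

	memset(&ctx, 0, sizeof(ctx));
	ctx.args[0] = mss;

	/* args[0] and replylong[0] occupy the same storage. */
	assert((void *)&ctx.args[0] == (void *)&ctx.replylong[0]);

	for (i = 0; i < sizeof(progs) / sizeof(progs[0]); i++)
		progs[i](&ctx);

	return mss_seen_by_b;
}
```

Calling run_progs(1460) in this sketch hands prog B the value prog A stored
in replylong[0] rather than the original MSS, which is the reason a reply
field inside this union does not work as an output channel once more than
one prog is attached.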