On Sun, Feb 16, 2025 at 2:01 AM Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
>
> Jason Xing wrote:
> > On Sat, Feb 15, 2025 at 11:10 PM Willem de Bruijn
> > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> > >
> > > Jason Xing wrote:
> > > > Add the bpf_sock_ops_enable_tx_tstamp kfunc to allow BPF programs to
> > > > selectively enable TX timestamping on a skb during tcp_sendmsg().
> > > >
> > > > For example, BPF program will limit tracking X numbers of packets
> > > > and then will stop there instead of tracing all the sendmsgs of
> > > > matched flow all along. It would be helpful for users who cannot
> > > > afford to calculate latencies from every sendmsg call probably
> > > > due to the performance or storage space consideration.
> > > >
> > > > Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx>
> > > > ---
> > > >  kernel/bpf/btf.c  |  1 +
> > > >  net/core/filter.c | 33 ++++++++++++++++++++++++++++++++-
> > > >  2 files changed, 33 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > > index 9433b6467bbe..740210f883dc 100644
> > > > --- a/kernel/bpf/btf.c
> > > > +++ b/kernel/bpf/btf.c
> > > > @@ -8522,6 +8522,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> > > >  	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
> > > >  	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> > > >  	case BPF_PROG_TYPE_CGROUP_SYSCTL:
> > > > +	case BPF_PROG_TYPE_SOCK_OPS:
> > > >  		return BTF_KFUNC_HOOK_CGROUP;
> > > >  	case BPF_PROG_TYPE_SCHED_ACT:
> > > >  		return BTF_KFUNC_HOOK_SCHED_ACT;
> > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > index 7f56d0bbeb00..3b4c1e7b1470 100644
> > > > --- a/net/core/filter.c
> > > > +++ b/net/core/filter.c
> > > > @@ -12102,6 +12102,27 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
> > > >  #endif
> > > >  }
> > > >
> > > > +__bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
> > > > +					      u64 flags)
> > > > +{
> > > > +	struct sk_buff *skb;
> > > > +	struct sock *sk;
> > > > +
> > > > +	if (skops->op != BPF_SOCK_OPS_TS_SND_CB)
> > > > +		return -EOPNOTSUPP;
> > > > +
> > > > +	if (flags)
> > > > +		return -EINVAL;
> > > > +
> > > > +	skb = skops->skb;
> > > > +	sk = skops->sk;
> > >
> > > nit: not used
> >
> > BPF programs can use this in the future if necessary whereas the
> > selftests don't reflect it.
>
> How does defining a local variable help there?

Sorry, I didn't state it clearly. I meant you're right: for now it is
useless, even if it might be needed in the future. I will remove it.

>
> > > > +	skb_shinfo(skb)->tx_flags |= SKBTX_BPF;
> > > > +	TCP_SKB_CB(skb)->txstamp_ack |= TSTAMP_ACK_BPF;
> > > > +	skb_shinfo(skb)->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> > >
> > > Can this overwrite the seqno previously calculated by tcp_tx_timestamp?
> >
> > seqno? If you are referring to seqno, I don't think the BPF program is
> > allowed to modify it because SOCK_OPS_GET_OR_SET_FIELD() only supports
> > overwriting sk_txhash only. Please see sock_ops_convert_ctx_access().
>
> I meant tskey

It 'overwrites' the tskey here if the socket timestamping feature is
also on. But the seqno and len would not change during the gap between
tcp_tx_timestamp() and bpf_sock_ops_enable_tx_tstamp(), I think? If the
seq and len don't change, then the tskey is rewritten with the same
value rather than truly overwritten with a different one.

Unless you expect to see something like this:

	if (!skb_shinfo(skb)->tskey)
		skb_shinfo(skb)->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;

?
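
To make the difference between the two variants concrete, here is a
minimal userspace sketch. It is not kernel code: struct mock_skb and the
helpers set_tskey(), set_tskey_if_unset() and tskey_demo() are
hypothetical stand-ins for the fields discussed above, and the numbers
are made up. It only illustrates that recomputing seq + len - 1 from
unchanged inputs stores the same value again, while the guarded form
would preserve whatever an earlier writer stored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the three fields involved; this is NOT the
 * real struct sk_buff / skb_shared_info / tcp_skb_cb layout. */
struct mock_skb {
	uint32_t seq;   /* models TCP_SKB_CB(skb)->seq */
	uint32_t len;   /* models skb->len */
	uint32_t tskey; /* models skb_shinfo(skb)->tskey */
};

/* Unconditional assignment, as written in the patch. */
static void set_tskey(struct mock_skb *skb)
{
	skb->tskey = skb->seq + skb->len - 1;
}

/* Guarded assignment: only fill tskey when it is still zero. */
static void set_tskey_if_unset(struct mock_skb *skb)
{
	if (!skb->tskey)
		skb->tskey = skb->seq + skb->len - 1;
}

/* Walks through both variants; returns 0 when every check holds. */
static int tskey_demo(void)
{
	struct mock_skb skb = { .seq = 1000, .len = 500, .tskey = 0 };
	uint32_t first;

	/* First writer (think: socket timestamping path). */
	set_tskey(&skb);
	first = skb.tskey;

	/* Second writer (think: the BPF kfunc) with unchanged seq/len:
	 * the store happens again but the value is identical. */
	set_tskey(&skb);
	assert(skb.tskey == first);

	/* If an earlier writer had stored some other key (7 is an
	 * arbitrary, made-up value), the guarded form keeps it ... */
	skb.tskey = 7;
	set_tskey_if_unset(&skb);
	assert(skb.tskey == 7);

	/* ... while the unconditional form replaces it. */
	set_tskey(&skb);
	assert(skb.tskey == first);

	return 0;
}
```

So the two forms only behave differently when tskey was previously set
from something other than the current seq and len, which is the case
the question above is really about.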