Jason Xing wrote: > This patch introduces a new callback in tcp_tx_timestamp() to correlate > tcp_sendmsg timestamp with timestamps from other tx timestamping > callbacks (e.g., SND/SW/ACK). > > Without this patch, BPF program wouldn't know which timestamps belong > to which flow because of no socket lock protection. This new callback > is inserted in tcp_tx_timestamp() to address this issue because > tcp_tx_timestamp() still owns the same socket lock with > tcp_sendmsg_locked() in the meanwhile tcp_tx_timestamp() initializes > the timestamping related fields for the skb, especially tskey. The > tskey is the bridge to do the correlation. > > For TCP, BPF program hooks the beginning of tcp_sendmsg_locked() and > then stores the sendmsg timestamp at the bpf_sk_storage, correlating > this timestamp with its tskey that are later used in other sending > timestamping callbacks. > > Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx> > --- > include/uapi/linux/bpf.h | 5 +++++ > net/ipv4/tcp.c | 4 ++++ > tools/include/uapi/linux/bpf.h | 5 +++++ > 3 files changed, 14 insertions(+) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 9355d617767f..86fca729fbd8 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -7052,6 +7052,11 @@ enum { > * when SK_BPF_CB_TX_TIMESTAMPING > * feature is on. > */ > + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall > + * is triggered. It's used to correlate > + * sendmsg timestamp with corresponding > + * tskey. > + */ > }; > > /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index 12b9c4f9c151..4b9739cd3bc5 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -492,6 +492,10 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc) > if (tsflags & SOF_TIMESTAMPING_TX_RECORD_MASK) > shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1; > } > + > + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) && > + SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) > + bpf_skops_tx_timestamping(sk, skb, BPF_SOCK_OPS_TS_SND_CB); > } > > static bool tcp_stream_is_readable(struct sock *sk, int target) > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index d3e2988b3b4c..2739ee0154a0 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -7042,6 +7042,11 @@ enum { > * when SK_BPF_CB_TX_TIMESTAMPING > * feature is on. > */ > + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall > + * is triggered. It's used to correlate > + * sendmsg timestamp with corresponding > + * tskey. > + */ Feel free to decline this late in the review process, but a bit more bikeshedding.. Can we spell out TSTAMP instead of TS in these definitions? Within the context of this series it is self-explanatory, but when reading kernel code the meaning of a two letter acronym is not that clear. And instead of SND can we use SENDMSG or something like that? SND here confused me as the software timestamp is SCM_TSTAMP_SND. For instance: BPF_SOCK_OPS_TSTAMP_SENDMSG_CB, BPF_SOCK_OPS_TSTAMP_SCHED_CB, BPF_SOCK_OPS_TSTAMP_SND_SW_CB, BPF_SOCK_OPS_TSTAMP_SND_HW_CB, (BPF_SOCK_OPS_TSTAMP_TX_COMPLETION_CB,) BPF_SOCK_OPS_TSTAMP_ACK_CB, (not sure what the OPT in OPT_CB added).