On Wed, Feb 5, 2025 at 2:31 AM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote:
>
> Introduce the callback to correlate tcp_sendmsg timestamp with other
> points, like SND/SW/ACK. let bpf prog trace the beginning of
> tcp_sendmsg_locked() and then store the sendmsg timestamp at
> the bpf_sk_storage, so that in tcp_tx_timestamp() we can correlate
> the timestamp with tskey which can be found in other sending points.
>
> More details can be found in the selftest:
> The selftest uses the bpf_sk_storage to store the sendmsg timestamp at
> fentry/tcp_sendmsg_locked and retrieves it back at tcp_tx_timestamp
> (i.e. BPF_SOCK_OPS_TS_SND_CB added in this patch).
>
> Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx>
> ---
>  include/uapi/linux/bpf.h       | 7 +++++++
>  net/ipv4/tcp.c                 | 1 +
>  tools/include/uapi/linux/bpf.h | 7 +++++++
>  3 files changed, 15 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 800122a8abe5..accb3b314fff 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -7052,6 +7052,13 @@ enum {
>                                  * when SK_BPF_CB_TX_TIMESTAMPING
>                                  * feature is on.
>                                  */
> +       BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall
> +                                * is triggered. For TCP, it stays
> +                                * in the last send process to
> +                                * correlate with tcp_sendmsg timestamp
> +                                * with other timestamping callbacks,
> +                                * like SND/SW/ACK.
> +                                */
>  };

In case the intended use of the new flag gets buried across the many
threads, let me restate here how UDP would use it:

1. Introduce a field ts_opt_id_bpf, which works like ts_opt_id [1], to
   let the bpf program take full control of tskey management.
2. Hook udp_sendmsg() with fentry, and introduce a callback in the
   kernel, like BPF_SOCK_OPS_TIMEOUT_INIT, to initialize ts_opt_id_bpf
   with the tskey that the bpf prog generates. We can directly use
   BPF_SOCK_OPS_TS_SND_CB for this.
3. Modify the SCM_TS_OPT_ID logic to support the bpf extension, so that
   the newly added ts_opt_id_bpf field can be passed to
   skb_shinfo(skb)->tskey in __ip_append_data().

This approach can then be extended to other protocols as well.

[1]
commit 4aecca4c76808f3736056d18ff510df80424bc9f
Author: Vadim Fedorenko <vadim.fedorenko@xxxxxxxxx>
Date:   Tue Oct 1 05:57:14 2024 -0700

    net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message

    SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
    timestamps and packets sent via socket. Unfortunately, there is no way
    to reliably predict socket timestamp ID value in case of error returned
    by sendmsg. For UDP sockets it's impossible because of lockless
    nature of UDP transmit, several threads may send packets in parallel. In
    case of RAW sockets MSG_MORE option makes things complicated. More
    details are in the conversation [1].
    This patch adds new control message type to give user-space
    software an opportunity to control the mapping between packets and
    values by providing ID with each sendmsg for UDP sockets.
    The documentation is also added in this patch.

    [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@xxxxxxxxxxxxxx/

Thanks,
Jason

>
>  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 3df802410ebf..a2ac57543b6d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -501,6 +501,7 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc)
>                 tcb->txstamp_ack_bpf = 1;
>                 shinfo->tx_flags |= SKBTX_BPF;
>                 shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> +               bpf_skops_tx_timestamping(sk, skb, BPF_SOCK_OPS_TS_SND_CB);
>         }
>  }
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 06e68d772989..384502996cdd 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -7045,6 +7045,13 @@ enum {
>                                  * when SK_BPF_CB_TX_TIMESTAMPING
>                                  * feature is on.
>                                  */
> +       BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall
> +                                * is triggered. For TCP, it stays
> +                                * in the last send process to
> +                                * correlate with tcp_sendmsg timestamp
> +                                * with other timestamping callbacks,
> +                                * like SND/SW/ACK.
> +                                */
>  };
>
>  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> --
> 2.43.5
>
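
To make the TCP correlation flow from the commit message easier to follow, here is a rough userspace model in plain C. It is only a sketch and not the selftest or any kernel code: struct sk_model stands in for the per-socket bpf_sk_storage slot, and the names model_sendmsg_entry(), model_ts_snd_cb(), model_later_cb() and MAX_PENDING are all made up for illustration. The only part taken from the patch is the tskey computation (seq + len - 1) from tcp_tx_timestamp().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16 /* arbitrary table size for this model */

/* One recorded (tskey, sendmsg timestamp) pair. */
struct pending_ts {
	int used;
	uint32_t tskey;
	uint64_t sendmsg_ns;
};

/* Hypothetical stand-in for the per-socket bpf_sk_storage slot. */
struct sk_model {
	uint64_t sendmsg_ns; /* written at sendmsg entry */
	struct pending_ts pending[MAX_PENDING];
};

/* Models the fentry/tcp_sendmsg_locked step: remember when sendmsg began. */
static void model_sendmsg_entry(struct sk_model *sk, uint64_t now_ns)
{
	sk->sendmsg_ns = now_ns;
}

/*
 * Models the BPF_SOCK_OPS_TS_SND_CB step: compute the tskey the same way
 * tcp_tx_timestamp() does in the patch and tie the stored sendmsg
 * timestamp to it. If the table is full, the pair is simply not recorded.
 */
static uint32_t model_ts_snd_cb(struct sk_model *sk, uint32_t seq, uint32_t len)
{
	uint32_t tskey;
	size_t i;

	assert(len > 0);
	tskey = seq + len - 1;

	for (i = 0; i < MAX_PENDING; i++) {
		if (!sk->pending[i].used) {
			sk->pending[i].used = 1;
			sk->pending[i].tskey = tskey;
			sk->pending[i].sendmsg_ns = sk->sendmsg_ns;
			break;
		}
	}
	return tskey;
}

/*
 * Models a later SND/SW/ACK callback that carries the same tskey:
 * look up the sendmsg timestamp and report the elapsed time.
 * Returns 0 on success, -1 if the tskey was never recorded.
 */
static int model_later_cb(struct sk_model *sk, uint32_t tskey,
			  uint64_t now_ns, uint64_t *delta_ns)
{
	size_t i;

	for (i = 0; i < MAX_PENDING; i++) {
		if (sk->pending[i].used && sk->pending[i].tskey == tskey) {
			*delta_ns = now_ns - sk->pending[i].sendmsg_ns;
			return 0;
		}
	}
	return -1;
}
```

The point of the model is that the sendmsg timestamp alone is not attributable to any packet; it only becomes useful once the TS_SND step binds it to the tskey that the later callbacks will also see.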