Re: [PATCH net-next v2 06/12] net-timestamp: introduce TS_SCHED_OPT_CB to generate dev xmit timestamp

Martin KaFai Lau <martin.lau@xxxxxxxxx> · Tue, 15 Oct 2024 18:01:23 -0700

On 10/11/24 9:06 PM, Jason Xing wrote:
From: Jason Xing <kernelxing@xxxxxxxxxxx>

Introduce BPF_SOCK_OPS_TS_SCHED_OPT_CB flag so that we can decide to
print timestamps when the skb just passes the dev layer.

Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx>
---
  include/uapi/linux/bpf.h       |  5 +++++
  net/core/skbuff.c              | 17 +++++++++++++++--
  tools/include/uapi/linux/bpf.h |  5 +++++
  3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 157e139ed6fc..3cf3c9c896c7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7019,6 +7019,11 @@ enum {
  					 * by the kernel or the
  					 * earlier bpf-progs.
  					 */
+	BPF_SOCK_OPS_TS_SCHED_OPT_CB,	/* Called when skb is passing through
+					 * dev layer when SO_TIMESTAMPING
+					 * feature is on. It indicates the
+					 * recorded timestamp.
+					 */
  };
  
  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3a4110d0f983..16e7bdc1eacb 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -5632,8 +5632,21 @@ static void bpf_skb_tstamp_tx_output(struct sock *sk, int tstype)
  		return;
  
  	tp = tcp_sk(sk);
-	if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG))
-		return;
+	if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG)) {
+		struct timespec64 tstamp;
+		u32 cb_flag;
+
+		switch (tstype) {
+		case SCM_TSTAMP_SCHED:
+			cb_flag = BPF_SOCK_OPS_TS_SCHED_OPT_CB;
+			break;
+		default:
+			return;
+		}
+
+		tstamp = ktime_to_timespec64(ktime_get_real());
+		tcp_call_bpf_2arg(sk, cb_flag, tstamp.tv_sec, tstamp.tv_nsec);

There are already bpf_ktime_get_*() helpers. The bpf prog can call one of them 
directly and use whatever clock it sees fit, instead of the kernel enforcing the 
real clock here and doing an extra ktime_to_timespec64(). Right now the 
bpf_ktime_get_*() family does not have a real-clock variant, but I think one can 
be added.

Overall, I think the tstamp reporting interface does not necessarily have to 
follow the socket API. The bpf prog runs in the kernel, so the kernel could pass 
other information to it if that is useful, e.g. the original transmitted tcp 
skb.
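
To make the point concrete, here is a rough userspace analogy, not kernel or 
BPF code. Every name in it (ts_event, read_clock_ns, split_ns, join_ns, 
handle_event) is hypothetical and exists only for illustration. It contrasts 
the (tv_sec, tv_nsec) split that the hunk above performs before the callback 
with a handler that receives no timestamp and simply reads the clock it wants.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() under strict -std= */
#endif
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical event handed to the handler: it carries no timestamp. */
struct ts_event {
	int tstype;	/* illustrative event id, e.g. a SCHED-style event */
};

/* Read the clock the caller chose and return it as nanoseconds. */
static uint64_t read_clock_ns(clockid_t clk)
{
	struct timespec ts;

	if (clock_gettime(clk, &ts) != 0)
		return 0;
	return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* Analogue of the extra conversion: split ns into two callback args. */
static void split_ns(uint64_t ns, uint64_t *sec, uint64_t *nsec)
{
	*sec = ns / NSEC_PER_SEC;
	*nsec = ns % NSEC_PER_SEC;
}

/* What a consumer then has to do to get a single value back. */
static uint64_t join_ns(uint64_t sec, uint64_t nsec)
{
	return sec * NSEC_PER_SEC + nsec;
}

/* Handler that is given only the event and picks its own clock. */
static uint64_t handle_event(const struct ts_event *ev, clockid_t clk)
{
	(void)ev;	/* a real handler would switch on ev->tstype */
	return read_clock_ns(clk);
}
```

With the second shape the hook site only has to signal the event; 
handle_event() could equally be called with CLOCK_MONOTONIC or 
CLOCK_REALTIME, and no split/join round trip is needed.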

+	}
  }
  
  void __skb_tstamp_tx(struct sk_buff *orig_skb,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 93853d9d4922..d60675e1a5a0 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -7018,6 +7018,11 @@ enum {
  					 * by the kernel or the
  					 * earlier bpf-progs.
  					 */
+	BPF_SOCK_OPS_TS_SCHED_OPT_CB,	/* Called when skb is passing through
+					 * dev layer when SO_TIMESTAMPING
+					 * feature is on. It indicates the
+					 * recorded timestamp.
+					 */
  };
  
  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect