On Wed, Oct 16, 2024 at 9:01 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On 10/11/24 9:06 PM, Jason Xing wrote:
> > From: Jason Xing <kernelxing@xxxxxxxxxxx>
> >
> > Introduce BPF_SOCK_OPS_TS_SCHED_OPT_CB flag so that we can decide to
> > print timestamps when the skb just passes the dev layer.
> >
> > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx>
> > ---
> >  include/uapi/linux/bpf.h       |  5 +++++
> >  net/core/skbuff.c              | 17 +++++++++++++++--
> >  tools/include/uapi/linux/bpf.h |  5 +++++
> >  3 files changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 157e139ed6fc..3cf3c9c896c7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -7019,6 +7019,11 @@ enum {
> >                                        * by the kernel or the
> >                                        * earlier bpf-progs.
> >                                        */
> > +     BPF_SOCK_OPS_TS_SCHED_OPT_CB,   /* Called when skb is passing through
> > +                                      * dev layer when SO_TIMESTAMPING
> > +                                      * feature is on. It indicates the
> > +                                      * recorded timestamp.
> > +                                      */
> >  };
> >
> >  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 3a4110d0f983..16e7bdc1eacb 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -5632,8 +5632,21 @@ static void bpf_skb_tstamp_tx_output(struct sock *sk, int tstype)
> >               return;
> >
> >       tp = tcp_sk(sk);
> > -     if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG))
> > -             return;
> > +     if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG)) {
> > +             struct timespec64 tstamp;
> > +             u32 cb_flag;
> > +
> > +             switch (tstype) {
> > +             case SCM_TSTAMP_SCHED:
> > +                     cb_flag = BPF_SOCK_OPS_TS_SCHED_OPT_CB;
> > +                     break;
> > +             default:
> > +                     return;
> > +             }
> > +
> > +             tstamp = ktime_to_timespec64(ktime_get_real());
> > +             tcp_call_bpf_2arg(sk, cb_flag, tstamp.tv_sec, tstamp.tv_nsec);
>
> There is bpf_ktime_get_*() helper. The bpf prog can directly call the
> bpf_ktime_get_* helper and use whatever clock it sees fit instead of enforcing
> real clock here and doing an extra ktime_to_timespec64. Right now the
> bpf_ktime_get_*() does not have real clock which I think it can be added.

In that case, there is no need to add tcp_call_bpf_*arg() to pass the
timestamp to userspace, right? Let the bpf program implement it.

Now I wonder what information I should pass instead. Sorry for my lack of
BPF-related knowledge :(

>
> I think overall the tstamp reporting interface does not necessarily have to
> follow the socket API. The bpf prog is running in the kernel. It could pass
> other information to the bpf prog if it sees fit. e.g. the bpf prog could also
> get the original transmitted tcp skb if it is useful.

Good to know that! But how would the BPF program parse the skb via
tcp_call_bpf_2arg(), which only passes u32 parameters?

Thanks,
Jason