On Wed, Feb 5, 2025 at 11:54 PM Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote: > > Jason Xing wrote: > > Bpf prog calculates a couple of latency delta between each tx points > > which SO_TIMESTAMPING feature has already implemented. It can be used > > in the real world to diagnose the behaviour in the tx path. > > > > Also, check the safety issues by accessing a few bpf calls in > > bpf_test_access_bpf_calls(). > > > > Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx> > > > +static bool bpf_test_delay(struct bpf_sock_ops *skops, const struct sock *sk) > > +{ > > + struct bpf_sock_ops_kern *skops_kern; > > + u64 timestamp = bpf_ktime_get_ns(); > > + struct skb_shared_info *shinfo; > > + struct delay_info dinfo = {0}; > > + struct sk_tskey key = {0}; > > + struct delay_info *val; > > + struct sk_buff *skb; > > + struct sk_stg *stg; > > + u64 prior_ts, delay; > > + > > + if (bpf_test_access_bpf_calls(skops, sk)) > > + return false; > > + > > + skops_kern = bpf_cast_to_kern_ctx(skops); > > + skb = skops_kern->skb; > > + shinfo = bpf_core_cast(skb->head + skb->end, struct skb_shared_info); > > + key.tskey = shinfo->tskey; > > + if (!key.tskey) > > + return false; > > + > > + key.cookie = bpf_get_socket_cookie(skops); > > + if (!key.cookie) > > + return false; > > + > > + if (skops->op == BPF_SOCK_OPS_TS_SND_CB) { > > + stg = bpf_sk_storage_get(&sk_stg_map, (void *)sk, 0, 0); > > + if (!stg) > > + return false; > > + dinfo.sendmsg_ns = stg->sendmsg_ns; > > + bpf_map_update_elem(&time_map, &key, &dinfo, BPF_ANY); > > + goto out; > > + } > > + > > + val = bpf_map_lookup_elem(&time_map, &key); > > + if (!val) > > + return false; > > + > > + switch (skops->op) { > > + case BPF_SOCK_OPS_TS_SCHED_OPT_CB: > > + delay = val->sched_delay = timestamp - val->sendmsg_ns; > > + break; > > For a test this is fine. But just a reminder that in general a packet > may pass through multiple qdiscs. For instance with bonding or tunnel > virtual devices in the egress path. Right, I've seen this in production (two times qdisc timestamps because of bonding). > > > + case BPF_SOCK_OPS_TS_SW_OPT_CB: > > + prior_ts = val->sched_delay + val->sendmsg_ns; > > + delay = val->sw_snd_delay = timestamp - prior_ts; > > + break; > > + case BPF_SOCK_OPS_TS_ACK_OPT_CB: > > + prior_ts = val->sw_snd_delay + val->sched_delay + val->sendmsg_ns; > > + delay = val->ack_delay = timestamp - prior_ts; > > + break; > > Similar to the above: fine for a test, but in practice be aware that > packets may be resent, in which case an ACK might precede a repeat > SCHED and SND. And erroneous or malicious peers may also just never > send an ACK. So this can never be relied on in production settings, > e.g., as the only signal to clear an entry from a map (as done in the > branch below). Agreed. In production, actually what we do is print all the timestamps and let an agent parse them. > > > + } > > + > > + if (delay >= delay_tolerance_nsec) > > + return false; > > + > > + /* Since it's the last one, remove from the map after latency check */ > > + if (skops->op == BPF_SOCK_OPS_TS_ACK_OPT_CB) > > + bpf_map_delete_elem(&time_map, &key); > > + > > +out: > > + return true; > > +} > > +