On Tue, Oct 15, 2024 at 9:28 AM Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote:
>
> Jason Xing wrote:
> > On Sun, Oct 13, 2024 at 1:48 AM Willem de Bruijn
> > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> > >
> > > Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@xxxxxxxxxxx>
> > > >
> > > > A few weeks ago, I planned to extend SO_TIMESTAMPING feature by using
> > > > tracepoint to print information (say, tstamp) so that we can
> > > > transparently equip applications with this feature and require no
> > > > modification in user side.
> > > >
> > > > Later, we discussed at netconf and agreed that we can use bpf for better
> > > > extension, which is mainly suggested by John Fastabend and Willem de
> > > > Bruijn. Many thanks here! So I post this series to see if we have a
> > > > better solution to extend. My feeling is BPF is a good place to provide
> > > > a way to add timestamping by administrators, without having to rebuild
> > > > applications.
> > > >
> > > > This approach mostly relies on existing SO_TIMESTAMPING feature, users
> > > > only need to pass certain flags through bpf_setsockopt() to a separate
> > > > tsflags. For TX timestamps, they will be printed during generation
> > > > phase. For RX timestamps, we will wait for the moment when recvmsg() is
> > > > called.
> > > >
> > > > After this series, we could step by step implement more advanced
> > > > functions/flags already in SO_TIMESTAMPING feature for bpf extension.
> > > >
> > > > In this series, I only support TCP protocol which is widely used in
> > > > SO_TIMESTAMPING feature.
> > > >
> > > > ---
> > > > V2
> > > > Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@xxxxxxxxx/
> > > > 1. Introduce tsflag requestors so that we are able to extend more in the
> > > > future. Besides, it enables TX flags for bpf extension feature separately
> > > > without breaking users. It is suggested by Vadim Fedorenko.
> > > > 2. introduce a static key to control the whole feature. (Willem)
> > > > 3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
> > > > some TX/RX cases, not all the cases.
> > > >
> > > > Note:
> > > > The main concern we've discussed in V1 thread is how to deal with the
> > > > applications using SO_TIMESTAMPING feature? In this series, I allow both
> > > > cases to happen at the same time, which indicates that even an
> > > > application setting SO_TIMESTAMPING can still be traced through BPF
> > > > program. Please see patch [04/12].
> > >
> > > This revision does not address the main concern.
> > >
> > > An administrator installed BPF program can affect results of a process
> > > using SO_TIMESTAMPING in ways that break it.
> >
> > Sorry, I didn't get it. How would the following code snippet break users?
>
> The state between user and bpf timestamping needs to be separate to
> avoid interference.

Do you agree that we use the following method, which allows only one of
them to work at a time?

void __skb_tstamp_tx(struct sk_buff *orig_skb,
                     const struct sk_buff *ack_skb,
                     struct skb_shared_hwtstamps *hwtstamps,
                     struct sock *sk, int tstype)
{
        int ret;

        if (!sk)
                return;

        ret = skb_tstamp_tx_output(orig_skb, ack_skb, hwtstamps, sk, tstype);
        if (ret)
                /* The app has set the SO_TIMESTAMPING flag, so return directly */
                return;

        if (static_branch_unlikely(&bpf_tstamp_control))
                bpf_skb_tstamp_tx_output(sk, orig_skb, tstype, hwtstamps);
}

This means that if the app uses the non-bpf method, we will not see any
bpf output even if a bpf program is loaded.

>
> Introducing a new sk_tsflags for bpf goes a long way. Though I prefer
> a separate sk_tsflags_bpf and not touching existing sk_tsflags over
> the array approach of patch 1. Also need to check pahole and maybe
> move sk_tsflags_bpf elsewhere in the struct.

Yes, I will use this instead.

>
> Other state is sk_tskey. The current approach can initialize the key
> in bpf before the user attempts it for the same socket. Admittedly
> unlikely. But hard to reach states create hard to debug issues.
>
> This field cannot easily be duplicated, because the key is tracked
> in skb_shinfo. Where there is not sufficient room for two keys.
>
> The same goes for txflags.

They are not that easy to handle in a proper way. That's the reason why
I chose to use the same logic, so that there is no side effect. If we
are expected to separate them as well, it seems a little bit weird to
introduce another similar set of flags in struct sk_buff.

>
> The current approach is to set those flags if either user or bpf
> requests them, then on __skb_tstamp_tx detect if the user did not set
> them, and if so skip output to the user. Need to take a closer look,
> but seems to work.

Let me keep the current approach; the two requesters will not affect
each other.

>
> So getting closer. Thanks for the careful review.

Thanks,
Jason
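
For reference, the "current approach" quoted above (set the shared tx flags if either the user or bpf requests them, then decide per requester at report time) can be modelled in plain userspace C. This is only a rough sketch under assumptions: every name below (model_sock, model_report, MODEL_TX_SOFTWARE, model_tstamp_tx) is hypothetical and not kernel API, and the two flag words only stand in for sk_tsflags and the proposed sk_tsflags_bpf.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_TX_SOFTWARE 0x1u  /* illustrative flag bit only */

struct model_sock {
        unsigned int tsflags_user; /* stands in for sk->sk_tsflags */
        unsigned int tsflags_bpf;  /* stands in for the proposed sk_tsflags_bpf */
};

struct model_report {
        bool to_user; /* timestamp would be queued for the application */
        bool to_bpf;  /* timestamp would be handed to the bpf program */
};

/* The shared per-skb tx flag is set if either requester asked for it. */
static bool model_should_mark_skb(const struct model_sock *sk, unsigned int flag)
{
        return ((sk->tsflags_user | sk->tsflags_bpf) & flag) != 0;
}

/*
 * At report time, each requester is checked against its own flag word, so
 * a bpf-only request never produces output for the application.
 */
static struct model_report model_tstamp_tx(const struct model_sock *sk,
                                           unsigned int flag)
{
        struct model_report r = { false, false };

        if (!model_should_mark_skb(sk, flag))
                return r;

        r.to_user = (sk->tsflags_user & flag) != 0;
        r.to_bpf = (sk->tsflags_bpf & flag) != 0;
        return r;
}
```

In this model a socket where only bpf set the flag reports to bpf alone, and a socket where both set it reports to both. The sketch does not model sk_tskey, which is the state that cannot be duplicated in skb_shinfo.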