Re: [PATCH net-next 0/9] net-timestamp: bpf extension to equip applications transparently

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 9, 2024 at 9:16 PM Vadim Fedorenko
<vadim.fedorenko@xxxxxxxxx> wrote:
>
> On 09/10/2024 12:48, Jason Xing wrote:
> > On Wed, Oct 9, 2024 at 7:12 PM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote:
> >>
> >> On Wed, Oct 9, 2024 at 5:28 PM Vadim Fedorenko
> >> <vadim.fedorenko@xxxxxxxxx> wrote:
> >>>
> >>> On 09/10/2024 02:05, Jason Xing wrote:
> >>>> On Wed, Oct 9, 2024 at 7:22 AM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On Wed, Oct 9, 2024 at 2:44 AM Willem de Bruijn
> >>>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> Jason Xing wrote:
> >>>>>>> From: Jason Xing <kernelxing@xxxxxxxxxxx>
> >>>>>>>
> >>>>>>> A few weeks ago, I planned to extend SO_TIMESTMAMPING feature by using
> >>>>>>> tracepoint to print information (say, tstamp) so that we can
> >>>>>>> transparently equip applications with this feature and require no
> >>>>>>> modification in user side.
> >>>>>>>
> >>>>>>> Later, we discussed at netconf and agreed that we can use bpf for better
> >>>>>>> extension, which is mainly suggested by John Fastabend and Willem de
> >>>>>>> Bruijn. Many thanks here! So I post this series to see if we have a
> >>>>>>> better solution to extend.
> >>>>>>>
> >>>>>>> This approach relies on existing SO_TIMESTAMPING feature, for tx path,
> >>>>>>> users only needs to pass certain flags through bpf program to make sure
> >>>>>>> the last skb from each sendmsg() has timestamp related controlled flag.
> >>>>>>> For rx path, we have to use bpf_setsockopt() to set the sk->sk_tsflags
> >>>>>>> and wait for the moment when recvmsg() is called.
> >>>>>>
> >>>>>> As you mention, overall I am very supportive of having a way to add
> >>>>>> timestamping by adminstrators, without having to rebuild applications.
> >>>>>> BPF hooks seem to be the right place for this.
> >>>>>>
> >>>>>> There is existing kprobe/kretprobe/kfunc support. Supporting
> >>>>>> SO_TIMESTAMPING directly may be useful due to its targeted feature
> >>>>>> set, and correlation between measurements for the same data in the
> >>>>>> stream.
> >>>>>>
> >>>>>>> After this series, we could step by step implement more advanced
> >>>>>>> functions/flags already in SO_TIMESTAMPING feature for bpf extension.
> >>>>>>
> >>>>>> My main implementation concern is where this API overlaps with the
> >>>>>> existing user API, and how they might conflict. A few questions in the
> >>>>>> patches.
> >>>>>
> >>>>> Agreed. That's also what I'm concerned about. So I decided to ask for
> >>>>> related experts' help.
> >>>>>
> >>>>> How to deal with it without interfering with the existing apps in the
> >>>>> right way is the key problem.
> >>>>
> >>>> What I try to implement is let the bpf program have the highest
> >>>> precedence. It's similar to RTO min, see the commit as an example:
> >>>>
> >>>> commit f086edef71be7174a16c1ed67ac65a085cda28b1
> >>>> Author: Kevin Yang <yyd@xxxxxxxxxx>
> >>>> Date:   Mon Jun 3 21:30:54 2024 +0000
> >>>>
> >>>>       tcp: add sysctl_tcp_rto_min_us
> >>>>
> >>>>       Adding a sysctl knob to allow user to specify a default
> >>>>       rto_min at socket init time, other than using the hard
> >>>>       coded 200ms default rto_min.
> >>>>
> >>>>       Note that the rto_min route option has the highest precedence
> >>>>       for configuring this setting, followed by the TCP_BPF_RTO_MIN
> >>>>       socket option, followed by the tcp_rto_min_us sysctl.
> >>>>
> >>>> It includes three cases, 1) route option, 2) bpf option, 3) sysctl.
> >>>> The first priority can override others. It doesn't have a good
> >>>> chance/point to restore the icsk_rto_min field if users want to
> >>>> shutdown the bpf program because it is set in
> >>>> bpf_sol_tcp_setsockopt().
> >>>
> >>> rto_min example is slightly different. With tcp_rto_min the doesn't
> >>> expect any data to come back to user space while for timestamping the
> >>> app may be confused directly by providing more data, or by not providing
> >>> expected data. I believe some hint about requestor of the data is needed
> >>> here. It will also help to solve the problem of populating sk_err_queue
> >>> mentioned by Martin.
> >>
> >> Sorry, I don't fully get it. In this patch series, this bpf extension
> >> feature will not rely on sk_err_queue any more to report tx timestamps
> >> to userspace. Bpf program can do that printing.
> >>
> >> Do you mean that it could be wrong if one skb carries the tsflags that
> >> are previously set due to the bpf program and then suddenly users
> >> detach the program? It indeed will put a new/cloned skb into the error
> >> queue. Interesting corner case. It seems I have to re-implement a
> >> totally independent tsflags for bpf extension feature. Do you have a
> >> better idea on this?
> >
> > I feel that if I could introduce bpf new flags like
> > SOF_TIMESTAMPING_TX_ACK_BPF for the last skb based on this patch
> > series, then it will not populate skb in sk_err_queue even users
> > remove the bpf program all of sudden. With this kind of specific bpf
> > flags, we can also avoid conflicting with the apps using
> > SO_TIEMSTAMPING feature. Let me give it a shot unless a better
> > solution shows up.
>
> It doesn't look great to have duplicate flags just to indicate that this
> particular timestamp was asked by a bpf program, even though it looks

Or introduce a new field in struct sock or struct sk_buff so that
existing SOF_TIMESTAMPING_* can be reused.

> like a straight forward solution. Sounds like we have to re-think the
> interface for timestamping requests, but I don't have proper suggestion
> right now.

Thanks for your help :)

Thanks,
Jason





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux