Luiz Augusto von Dentz wrote: > Hi Willem, > > On Mon, Feb 10, 2025 at 1:43 PM Willem de Bruijn > <willemdebruijn.kernel@xxxxxxxxx> wrote: > > > > Pauli Virtanen wrote: > > > Hi, > > > > > > su, 2025-02-09 kello 22:29 -0500, Willem de Bruijn kirjoitti: > > > > Pauli Virtanen wrote: > > > > > Add SOF_TIMESTAMPING_TX_COMPLETION, for requesting a software timestamp > > > > > when hardware reports a packet completed. > > > > > > > > > > Completion tstamp is useful for Bluetooth, where hardware tx timestamps > > > > > cannot be obtained except for ISO packets, and the hardware has a queue > > > > > where packets may wait. In this case the software SND timestamp only > > > > > reflects the kernel-side part of the total latency (usually small) and > > > > > queue length (usually 0 unless HW buffers congested), whereas the > > > > > completion report time is more informative of the true latency. > > > > > > > > > > It may also be useful in other cases where HW TX timestamps cannot be > > > > > obtained and user wants to estimate an upper bound to when the TX > > > > > probably happened. > > > > > > > > Getting the completion timestamp may indeed be useful more broadly. > > > > > > > > Alternatively, the HW timestamp is relatively imprecisely defined so > > > > you could even just use that. Ideally, a hw timestamp conforms to IEEE > > > > 1588v2 PHY: first symbol on the wire IIRC. But in many cases this is > > > > not the case. It is not feasible at line rate, or the timestamp is > > > > only taken when the completion is written over PCI, which may be > > > > subject to PCI backpressure and happen after transmission on the wire. > > > > As a result, the worst case hw tstamp must already be assumed not much > > > > earlier than a completion timestamp. > > > > > > For BT ISO packets, in theory hw-provided TX timestamps exist, and we > > > might want both (with separate flags for enabling them). I don't really > > > know, last I looked Intel HW didn't support them, and it's not clear to > > > which degree they are useful. > > > > That's reason enough to separate these measurement types. > > > > If we don't do it properly now, we won't be able to update drivers > > later once users depend on requesting hw timestamps when they mean to > > get completion timestamps. > > > > > > That said, +1 on adding explicit well defined measurement point > > > > instead. > > > > > > > > > Signed-off-by: Pauli Virtanen <pav@xxxxxx> > > > > > --- > > > > > Documentation/networking/timestamping.rst | 9 +++++++++ > > > > > include/linux/skbuff.h | 6 +++++- > > > > > include/uapi/linux/errqueue.h | 1 + > > > > > include/uapi/linux/net_tstamp.h | 6 ++++-- > > > > > net/ethtool/common.c | 1 + > > > > > net/socket.c | 3 +++ > > > > > 6 files changed, 23 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst > > > > > index 61ef9da10e28..de2afed7a516 100644 > > > > > --- a/Documentation/networking/timestamping.rst > > > > > +++ b/Documentation/networking/timestamping.rst > > > > > @@ -140,6 +140,15 @@ SOF_TIMESTAMPING_TX_ACK: > > > > > cumulative acknowledgment. The mechanism ignores SACK and FACK. > > > > > This flag can be enabled via both socket options and control messages. > > > > > > > > > > +SOF_TIMESTAMPING_TX_COMPLETION: > > > > > + Request tx timestamps on packet tx completion. The completion > > > > > + timestamp is generated by the kernel when it receives packet a > > > > > + completion report from the hardware. Hardware may report multiple > > > > > + packets at once, and completion timestamps reflect the timing of the > > > > > + report and not actual tx time. The completion timestamps are > > > > > + currently implemented only for: Bluetooth L2CAP and ISO. This > > > > > + flag can be enabled via both socket options and control messages. > > > > > + > > > > > > > > Either we should support this uniformly, or it should be possible to > > > > query whether a driver supports this. > > > > > > > > Unfortunately all completion callbacks are driver specific. > > > > > > > > But drivers that support hwtstamps will call skb_tstamp_tx with > > > > nonzero hwtstamps. We could use that also to compute and queue > > > > a completion timestamp if requested. At least for existing NIC > > > > drivers. > > > > > > Ok. If possible, I'd like to avoid changing the behavior of the non- > > > Bluetooth parts of net/ here, as I'm not familiar with those. > > > > > > I guess a simpler solution could be that sock_set_timestamping() checks > > > the type of the socket, and gives EINVAL if the flag is set for non- > > > Bluetooth sockets? > > > > Actually, I'd prefer to have this completion timestamp ability for all > > drivers. And avoid creating subsystem private mechanisms. > > > > I suppose we can punt on the get_ts_info control API if need be. > > I guess that it is reasonable if we don't have to do the work for > drivers other than Bluetooth otherwise I'd say you are probably asking > too much here, I was mainly agreeing to Pauli's implementation in this series. Not asking to do any work irrelevant to BT. > also doesn't this land on the TSN space if one needs to > tightly control timings? I suspect if this sort of change was not > necessary for TSN then perhaps it wouldn't be of much value to try to > generalize this. I don't fully follow. But in general SO_TIMESTAMPING is not limited to TSN. > > > > One could then postpone having to invent how to check the driver > > > support, and user would know non-supported status from setsockopt > > > failing. > > > > > > > > 1.3.2 Timestamp Reporting > > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > > > > > index bb2b751d274a..3707c9075ae9 100644 > > > > > --- a/include/linux/skbuff.h > > > > > +++ b/include/linux/skbuff.h > > > > > @@ -489,10 +489,14 @@ enum { > > > > > > > > > > /* generate software time stamp when entering packet scheduling */ > > > > > SKBTX_SCHED_TSTAMP = 1 << 6, > > > > > + > > > > > + /* generate software time stamp on packet tx completion */ > > > > > + SKBTX_COMPLETION_TSTAMP = 1 << 7, > > > > > }; > > > > > > > > > > #define SKBTX_ANY_SW_TSTAMP (SKBTX_SW_TSTAMP | \ > > > > > - SKBTX_SCHED_TSTAMP) > > > > > + SKBTX_SCHED_TSTAMP | \ > > > > > + SKBTX_COMPLETION_TSTAMP) > > > > > > > > These fields are used in the skb_shared_info tx_flags field. > > > > Which is a very scarce resource. This takes the last available bit. > > > > That is my only possible concern: the opportunity cost. > > > > > > If doing it per-protocol sounds ok, it could be put in bt_skb_cb > > > instead. > > > > > > Since the completion timestamp didn't already exist, it maybe means > > > it's probably not that important for other parts of net/ > > > > I can see its value especially for hardware that does not support > > hardware timestamps, or hw timestamps at line rate. > > > > This gives a reasonable estimation of transmission time and > > measure of device delay. > > > > It is device specific whether it will be an over- or under-estimation, > > depending on whether the completion is queued to the host after or > > before the data is written on the wire. But either way, it will > > include the delay in processing the tx queue, which on multi-queue > > NICs and with TSO may be substantial (even before considering HW > > rate limiting). > > > > > > > #define SKBTX_ANY_TSTAMP (SKBTX_HW_TSTAMP | \ > > > > > SKBTX_HW_TSTAMP_USE_CYCLES | \ > > > > > SKBTX_ANY_SW_TSTAMP) > > > > > diff --git a/include/uapi/linux/errqueue.h b/include/uapi/linux/errqueue.h > > > > > index 3c70e8ac14b8..1ea47309d772 100644 > > > > > --- a/include/uapi/linux/errqueue.h > > > > > +++ b/include/uapi/linux/errqueue.h > > > > > @@ -73,6 +73,7 @@ enum { > > > > > SCM_TSTAMP_SND, /* driver passed skb to NIC, or HW */ > > > > > SCM_TSTAMP_SCHED, /* data entered the packet scheduler */ > > > > > SCM_TSTAMP_ACK, /* data acknowledged by peer */ > > > > > + SCM_TSTAMP_COMPLETION, /* packet tx completion */ > > > > > }; > > > > > > > > > > #endif /* _UAPI_LINUX_ERRQUEUE_H */ > > > > > diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h > > > > > index 55b0ab51096c..383213de612a 100644 > > > > > --- a/include/uapi/linux/net_tstamp.h > > > > > +++ b/include/uapi/linux/net_tstamp.h > > > > > @@ -44,8 +44,9 @@ enum { > > > > > SOF_TIMESTAMPING_BIND_PHC = (1 << 15), > > > > > SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16), > > > > > SOF_TIMESTAMPING_OPT_RX_FILTER = (1 << 17), > > > > > + SOF_TIMESTAMPING_TX_COMPLETION = (1 << 18), > > > > > > > > > > - SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_RX_FILTER, > > > > > + SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_TX_COMPLETION, > > > > > SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) | > > > > > SOF_TIMESTAMPING_LAST > > > > > }; > > > > > @@ -58,7 +59,8 @@ enum { > > > > > #define SOF_TIMESTAMPING_TX_RECORD_MASK (SOF_TIMESTAMPING_TX_HARDWARE | \ > > > > > SOF_TIMESTAMPING_TX_SOFTWARE | \ > > > > > SOF_TIMESTAMPING_TX_SCHED | \ > > > > > - SOF_TIMESTAMPING_TX_ACK) > > > > > + SOF_TIMESTAMPING_TX_ACK | \ > > > > > + SOF_TIMESTAMPING_TX_COMPLETION) > > > > > > > > > > /** > > > > > * struct so_timestamping - SO_TIMESTAMPING parameter > > > > > diff --git a/net/ethtool/common.c b/net/ethtool/common.c > > > > > index 2bd77c94f9f1..75e3b756012e 100644 > > > > > --- a/net/ethtool/common.c > > > > > +++ b/net/ethtool/common.c > > > > > @@ -431,6 +431,7 @@ const char sof_timestamping_names[][ETH_GSTRING_LEN] = { > > > > > [const_ilog2(SOF_TIMESTAMPING_BIND_PHC)] = "bind-phc", > > > > > [const_ilog2(SOF_TIMESTAMPING_OPT_ID_TCP)] = "option-id-tcp", > > > > > [const_ilog2(SOF_TIMESTAMPING_OPT_RX_FILTER)] = "option-rx-filter", > > > > > + [const_ilog2(SOF_TIMESTAMPING_TX_COMPLETION)] = "completion-transmit", > > > > > > > > just "tx-completion"? > > > > > > Ok. > > > > > > -- > > > Pauli Virtanen > > > > > > > -- > Luiz Augusto von Dentz