From: Jason Xing <kernelxing@xxxxxxxxxxx> introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter out rx software timestamp report, especially after a process turns on netstamp_needed_key which can time stamp every incoming skb. Previously, we found out if an application starts first which turns on netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE could also get rx timestamp. Now we handle this case by introducing this new flag without breaking users. Quoting Willem to explain why we need the flag: "why a process would want to request software timestamp reporting, but not receive software timestamp generation. The only use I see is when the application does request SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE." In this way, we have two kinds of combination: 1. setting SOF_TIMESTAMPING_SOFTWARE|SOF_TIMESTAMPING_RX_SOFTWARE, it will surely allow users to get the rx software timestamp report. 2. setting SOF_TIMESTAMPING_SOFTWARE|SOF_TIMESTAMPING_OPT_RX_FILTER while the skb is timestamped, it will stop reporting the rx software timestamp. Another thing about errqueue in this patch I have a few words to say: In this case, we need to handle the egress path carefully, or else reporting the tx timestamp will fail. Egress path and ingress path will finally call sock_recv_timestamp(). We have to distinguish them. Errqueue is a good indicator to reflect the flow direction. Suggested-by: Willem de Bruijn <willemb@xxxxxxxxxx> Reviewed-by: Willem de Bruijn <willemb@xxxxxxxxxx> Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx> --- v4 Link: https://lore.kernel.org/all/20240830153751.86895-2-kerneljasonxing@xxxxxxxxx/ 1. revise the commit message and doc (Willem) 2. simplify the test statement (Jakub) 3. add Willem's reviewed-by tag (Willem) v3 1. Willem suggested this alternative way to solve the issue, so I added his Suggested-by tag here. Thanks! --- Documentation/networking/timestamping.rst | 12 ++++++++++++ include/uapi/linux/net_tstamp.h | 3 ++- net/core/sock.c | 4 ++++ net/ethtool/common.c | 1 + net/ipv4/tcp.c | 6 ++++-- net/socket.c | 4 +++- 6 files changed, 26 insertions(+), 4 deletions(-) diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst index 5e93cd71f99f..37ead02be3b1 100644 --- a/Documentation/networking/timestamping.rst +++ b/Documentation/networking/timestamping.rst @@ -266,6 +266,18 @@ SOF_TIMESTAMPING_OPT_TX_SWHW: two separate messages will be looped to the socket's error queue, each containing just one timestamp. +SOF_TIMESTAMPING_OPT_RX_FILTER: + Used in the receive software timestamp. Enabling the flag along with + SOF_TIMESTAMPING_SOFTWARE will not report the rx timestamp to the + userspace so that it can filter out the case where one process starts + first which turns on netstamp_needed_key through setting generation + flags like SOF_TIMESTAMPING_RX_SOFTWARE, then another one only passing + SOF_TIMESTAMPING_SOFTWARE report flag could also get the rx timestamp. + + SOF_TIMESTAMPING_OPT_RX_FILTER prevents the application from being + influenced by others and let the application choose whether to report + the timestamp in the receive path or not. + New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate regardless of the setting of sysctl net.core.tstamp_allow_data. diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h index a2c66b3d7f0f..858339d1c1c4 100644 --- a/include/uapi/linux/net_tstamp.h +++ b/include/uapi/linux/net_tstamp.h @@ -32,8 +32,9 @@ enum { SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14), SOF_TIMESTAMPING_BIND_PHC = (1 << 15), SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16), + SOF_TIMESTAMPING_OPT_RX_FILTER = (1 << 17), - SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_ID_TCP, + SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_RX_FILTER, SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) | SOF_TIMESTAMPING_LAST }; diff --git a/net/core/sock.c b/net/core/sock.c index 468b1239606c..6a93344e21cf 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -908,6 +908,10 @@ int sock_set_timestamping(struct sock *sk, int optname, !(val & SOF_TIMESTAMPING_OPT_ID)) return -EINVAL; + if (val & SOF_TIMESTAMPING_RX_SOFTWARE && + val & SOF_TIMESTAMPING_OPT_RX_FILTER) + return -EINVAL; + if (val & SOF_TIMESTAMPING_OPT_ID && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID)) { if (sk_is_tcp(sk)) { diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 781834ef57c3..6c245e59bbc1 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -427,6 +427,7 @@ const char sof_timestamping_names[][ETH_GSTRING_LEN] = { [const_ilog2(SOF_TIMESTAMPING_OPT_TX_SWHW)] = "option-tx-swhw", [const_ilog2(SOF_TIMESTAMPING_BIND_PHC)] = "bind-phc", [const_ilog2(SOF_TIMESTAMPING_OPT_ID_TCP)] = "option-id-tcp", + [const_ilog2(SOF_TIMESTAMPING_OPT_RX_FILTER)] = "option-rx-filter", }; static_assert(ARRAY_SIZE(sof_timestamping_names) == __SOF_TIMESTAMPING_CNT); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8a5680b4e786..a0c57c8b77bd 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2235,6 +2235,7 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk, struct scm_timestamping_internal *tss) { int new_tstamp = sock_flag(sk, SOCK_TSTAMP_NEW); + u32 tsflags = READ_ONCE(sk->sk_tsflags); bool has_timestamping = false; if (tss->ts[0].tv_sec || tss->ts[0].tv_nsec) { @@ -2274,14 +2275,15 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk, } } - if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_SOFTWARE) + if (tsflags & SOF_TIMESTAMPING_SOFTWARE && + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER)) has_timestamping = true; else tss->ts[0] = (struct timespec64) {0}; } if (tss->ts[2].tv_sec || tss->ts[2].tv_nsec) { - if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_RAW_HARDWARE) + if (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) has_timestamping = true; else tss->ts[2] = (struct timespec64) {0}; diff --git a/net/socket.c b/net/socket.c index fcbdd5bc47ac..f8609d649ed3 100644 --- a/net/socket.c +++ b/net/socket.c @@ -946,7 +946,9 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk, memset(&tss, 0, sizeof(tss)); tsflags = READ_ONCE(sk->sk_tsflags); - if ((tsflags & SOF_TIMESTAMPING_SOFTWARE) && + if ((tsflags & SOF_TIMESTAMPING_SOFTWARE && + (skb_is_err_queue(skb) || + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) && ktime_to_timespec64_cond(skb->tstamp, tss.ts + 0)) empty = 0; if (shhwtstamps && -- 2.37.3