On Sun, Feb 16, 2025 at 10:48 PM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > On Sun, Feb 16, 2025 at 10:45 PM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > > > On Sun, Feb 16, 2025 at 10:36 PM Willem de Bruijn > > <willemdebruijn.kernel@xxxxxxxxx> wrote: > > > > > > Jason Xing wrote: > > > > On Sun, Feb 16, 2025 at 6:58 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > > > > > > > > > On 2/15/25 2:23 PM, Jason Xing wrote: > > > > > > On Sun, Feb 16, 2025 at 2:08 AM Willem de Bruijn > > > > > > <willemdebruijn.kernel@xxxxxxxxx> wrote: > > > > > >> > > > > > >> Jason Xing wrote: > > > > > >>> On Sat, Feb 15, 2025 at 11:06 PM Willem de Bruijn > > > > > >>> <willemdebruijn.kernel@xxxxxxxxx> wrote: > > > > > >>>> > > > > > >>>> Jason Xing wrote: > > > > > >>>>> Support hw SCM_TSTAMP_SND case for bpf timestamping. > > > > > >>>>> > > > > > >>>>> Add a new sock_ops callback, BPF_SOCK_OPS_TS_HW_OPT_CB. This > > > > > >>>>> callback will occur at the same timestamping point as the user > > > > > >>>>> space's hardware SCM_TSTAMP_SND. The BPF program can use it to > > > > > >>>>> get the same SCM_TSTAMP_SND timestamp without modifying the > > > > > >>>>> user-space application. > > > > > >>>>> > > > > > >>>>> To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP > > > > > >>>>> with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers > > > > > >>>>> from driver side using SKBTX_HW_TSTAMP. The new definition of > > > > > >>>>> SKBTX_HW_TSTAMP means the combination tests of socket timestamping > > > > > >>>>> and bpf timestamping. After this patch, drivers can work under the > > > > > >>>>> bpf timestamping. > > > > > >>>>> > > > > > >>>>> Considering some drivers doesn't assign the skb with hardware > > > > > >>>>> timestamp, > > > > > >>>> > > > > > >>>> This is not for a real technical limitation, like the skb perhaps > > > > > >>>> being cloned or shared? > > > > > >>> > > > > > >>> Agreed on this point. I'm kind of familiar with I40E, so I dare to say > > > > > >>> the reason why it doesn't assign the hwtstamp is because the skb will > > > > > >>> soon be destroyed, that is to say, it's pointless to assign the > > > > > >>> timestamp. > > > > > >> > > > > > >> Makes sense. > > > > > >> > > > > > >> But that does not ensure that the skb is exclusively owned. Nor that > > > > > >> the same is true for all drivers using this API (which is not small, > > > > > >> but small enough to manually review if need be). > > > > > >> > > > > > >> The first two examples I happened to look at, i40e and bnx2x, both use > > > > > >> skb_get() to get a non-exclusive skb reference for their ptp_tx_skb. > > > > > > > > > > I think the existing __skb_tstamp_tx() function is also assigning to > > > > > skb_hwtstamps(skb). The skb may be cloned from the orig_skb first, but they > > > > > still share the same shinfo. My understanding is that this patch is assigning to > > > > > the shinfo earlier, so it should not have changed the driver's expectation on > > > > > the skb_hwtstamps(skb) after calling __skb_tstamp_tx(). If there are drivers > > > > > assuming exclusive access to the skb_hwtstamps(skb), probably it is something > > > > > that needs to be addressed regardless and should not be the common case? > > > > > > > > Right, it's also what I was trying to say but missed. Thanks for the > > > > supplementary info:) > > > > > > That existing behavior looks dodgy then, too. > > > > > > I don't have time to look into it deeply right now. But it seems to go > > > back all the way to the introduction of hw timestamping in commit > > > ac45f602ee3d in 2009. > > > > Right. And hardware timestamping has been used for many years, I presume. > > > > > > > > I can see how it works in that nothing else holding a clone will > > > likely have a reason to touch those fields. But that does not make it > > > correct. > > > > > > Your point that the new code is no worse than today probably is true. > > > > Right. > > > > > But when we spot something we prefer to fix it probably. Will need a > > > deeper look.. > > > > Got it. I added it to my to-do list. If you don't mind, I plan to take > > a deep look in March and then get back to you because recently I'm > > occupied by many things. I need to study some of the drivers that > > don't use skb_get() there. One more thing I'd like to mention here is if this is real bug, I think we should fix in the driver side since __skb_tstamp_tx() will always clone the skb when tsflag doesn't include SOF_TIMESTAMPING_OPT_TSONLY. We cannot force every application to add SOF_TIMESTAMPING_OPT_TSONLY option to prevent such an 'issue' happening. Let me set those special drivers as examples, if the skb can be destroyed in __skb_tstamp_tx() , it means the skb staying in the driver can be destroyed at any time before entering __skb_tstamp_tx(). Then we may conclude that the skb cannot be accessed safely as long as without skb_get() protection. Technically speaking, It's a driver side issue, not socket/bpf timestamping feature. The same resolution applies for bpf timestamping as well. So I feel this point would not block this series from being merged? My instinct tells me it can not be a problem because it was added in 2009. Sorry that I'm unable to persuade you through theory for now. Thanks, Jason