On Mon, Jul 20, 2020 at 02:45:03PM -0700, Jacob Keller wrote: > > > On 7/20/2020 2:05 PM, Vladimir Oltean wrote: > > On Mon, Jul 20, 2020 at 11:54:30AM -0700, Jacob Keller wrote: > >> On 7/18/2020 4:35 AM, Vladimir Oltean wrote: > >>> On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote: > >>>> On 7/17/2020 9:10 AM, Vladimir Oltean wrote: > >>>>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE`` > >>>>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``. > >>>>> +Originally, the network stack could deliver either a hardware or a software > >>>>> +time stamp, but not both. This flag prevents software timestamp delivery. > >>>>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW`` > >>>>> +option, but still the original behavior is preserved as the default. > >>>>> + > >>>> > >>>> So, this implies that we set this only if both are supported? I thought > >>>> the intention was to set this flag whenever we start a HW timestamp. > >>>> > >>> > >>> It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it > >>> seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good > >>> practice, but there are many situations where it can do more harm than > >>> good. > >>> > >> > >> I guess I've only ever implemented a driver with software timestamping > >> enabled as an option. What sort of issues arise when you have this set? > >> I'm guessing that it's some configuration of stacked devices as in the > >> other cases? If the issue can't be fixed I'd at least like more > >> explanation here, since the prevailing convention is that we set this > >> flag, so understanding when and why it's problematic would be useful. > >> > >> Thanks, > >> Jake > > > > Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The > > pattern is that: > > - DSA sets SKBTX_IN_PROGRESS > > - calls dev_queue_xmit towards the MAC driver > > - MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it > > - MAC driver delivers TX timestamp > > - DSA ends poll or receives TX interrupt, collects its timestamp, and > > delivers a second TX timestamp > > In fact this is explained in a bit more detail in the current > > timestamping.rst file. > > Not only are there existing in-tree drivers that do that (and various > > subtle variations of it), but new code also has this tendency to take > > shortcuts and interpret any SKBTX_IN_PROGRESS flag set as being set > > locally. Good thing it's caught during review most of the time these > > days. It's an error-prone design. > > On the DSA front, 1 driver sets this flag (sja1105) and 3 don't (felix, > > mv88e6xxx, hellcreek). The driver who had trouble because of this flag? > > sja1105. > > On the PHY front, 2 drivers set this flag (mscc_phy, dp83640) and 1 > > doesn't (ptp_ines). The driver who had trouble? dp83640. > > So it's very far from obvious that setting this flag is 'the prevailing > > convention'. For a MAC driver, that might well be, but for DSA/PHY, > > there seem to be risks associated with doing that, and driver writers > > should know what they're signing up for. > > > > Perhaps the issue is that the MAC driver using SKBTX_IN_PROGRESS as the > mechanism for telling if it should deliver a timestamp. Shouldn't they > be relying on SKBTX_HW_TSTAMP for the "please timestamp" notification, > and then using their own mechanism for forwarding that timestamp once > it's complete? > > I see a handful of drivers do rely on checking this, but I think that's > the real bug here. > > > -Vladimir > > Yes, indeed, a lot of them are exclusively checking "skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS", without any further verification that they have hardware timestamping enabled in the first place, a lot more than I remembered. Some of the occurrences are actually new. I think at least part of the reason why this keeps going on is that there aren't any hard and fast rules that say you shouldn't do it. When there isn't even a convincing percentage of DSA/PHY drivers that do set SKBTX_HW_TSTAMP, the chances are pretty low that you'll get a stacked PHC driver that sets the flag, plus a MAC driver that checks for it incorrectly. So people tend to ignore this case. Even though, if stacked DSA drivers started supporting software TX timestamping (which is not unlikely, given the fact that this would also give you compatibility with PHY timestamping), I'm sure things would change, because more people would become aware of the issue once mv88e6xxx starts getting affected. What I've been trying to do is at least try to get people (especially people who have a lot of XP with 1588 drivers) to agree on a common set of guidelines that are explicitly written down. I think that's step #1. -Vladimir