Re: [PATCH v3 3/5] net: Let the active time stamping layer be selectable.

[+ Horatiu]

On 2023-03-10 12:35, Vladimir Oltean wrote:
On Fri, Mar 10, 2023 at 11:48:52AM +0100, Köry Maincent wrote:
> From previous discussions, I believe that a device tree property was
> added in order to prevent perceived performance regressions when
> timestamping support is added to a PHY driver, correct?

Yes, i.e. to select the default and better timestamp on a board.

Is there a way to unambiguously determine the "better" timestamping on a board?

Is it plausible that over time, as PTP timestamping matures and,
for example, MDIO devices get support for PTP_SYS_OFFSET_EXTENDED
(an attempt was here: https://lkml.org/lkml/2019/8/16/638), the
relationship between PTP clock qualities changes, and the preference
changes with it?

> I have a dumb question: if updating the device trees is needed in order
> to prevent these behavior changes, then how is the regression problem
> addressed for those device trees which don't contain this new property
> (all device trees)?

In that case there is not really a solution,

If it's not really a solution, then doesn't this fail at its primary
purpose of preventing regressions?

but be aware that CONFIG_NETWORK_PHY_TIMESTAMPING needs to be enabled
to allow timestamping on the PHY. Currently only a few (3) defconfigs
in mainline have it enabled, so it is not really widespread,

Do distribution kernels use the defconfigs from the kernel, or do they
just enable as many options that sound good as possible?

maybe I could add more documentation to prevent further regressions
when adding timestamping support to a PHY driver.

My opinion is that either the problem was not correctly identified,
or the proposed solution does not address that problem.

What I believe is the problem is that adding support for PHY timestamping
to a PHY driver will cause a behavior change for existing systems which
are deployed with that PHY.

If I had a multi-port NIC where all ports share the same PHC, I would
want to create a boundary clock with it. I can do that just fine when
using MAC timestamping. But assume someone adds support for PHY
timestamping and the kernel switches to using PHY timestamps by default.
Now I need to keep in sync the PHCs of the PHYs, something which was
implicit before (all ports shared the same PHC). I have done nothing
incorrectly, yet my deployment doesn't work anymore. This is just an
example. It doesn't sound like a good idea in general for new features
to cause a behavior change by default.

Having identified that as the problem, I guess the solution should be
to stop doing that (and even though a PHY driver supports timestamping,
keep using the MAC timestamping by default).

There is a slight inconvenience caused by the fact that there are
already PHY drivers using PHY timestamping, and those may have been
introduced into deployments with PHY timestamping. We cannot change the
default behavior for those either. There are 5 such PHY drivers today
(I've grepped for mii_timestamper in drivers/net/phy).

I would suggest that the kernel implements a short whitelist of 5
entries containing PHY driver names, which are compared against
netdev->phydev->drv->name (with the appropriate NULL pointer checks).
Matches will default to PHY timestamping. Otherwise, the new default
will be to keep the behavior as if PHY timestamping doesn't exist
(MAC still provides the timestamps), and the user needs to select the
PHY as the timestamping source explicitly.
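
A minimal userspace sketch of how such a whitelist check might look, to make the proposal concrete. It is not kernel code: the three structs are stand-ins that model only the netdev->phydev->drv->name chain, the function and array names are hypothetical, and the whitelist entries are placeholders because the five driver names are not listed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel structures. Only the fields
 * needed to follow netdev->phydev->drv->name are modelled. */
struct phy_driver { const char *name; };
struct phy_device { const struct phy_driver *drv; };
struct net_device { const struct phy_device *phydev; };

/* Hypothetical placeholder entries, NOT real PHY driver names. */
static const char *const phy_ts_whitelist[] = {
	"example-phy-a",
	"example-phy-b",
};

/* Return true if timestamping should default to the PHY, false if
 * the MAC should keep providing the timestamps. Any missing link
 * in the pointer chain falls back to the MAC. */
static bool phy_ts_is_default(const struct net_device *netdev)
{
	size_t n = sizeof(phy_ts_whitelist) / sizeof(phy_ts_whitelist[0]);
	size_t i;

	if (!netdev || !netdev->phydev || !netdev->phydev->drv ||
	    !netdev->phydev->drv->name)
		return false;

	for (i = 0; i < n; i++) {
		if (!strcmp(netdev->phydev->drv->name, phy_ts_whitelist[i]))
			return true;
	}

	return false;
}
```

With this shape, a PHY driver that gains timestamping support later is not on the list, so its boards keep MAC timestamping until the user selects the PHY explicitly.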

Thoughts?

While I agree in principle (I have suggested making MAC timestamping
the default before), I see a problem with the recent LAN8814 PHY
timestamping support, which will likely be released with 6.3. That
would now switch the timestamping to PHY timestamping for our board
(arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt-8g.dts). I could
argue that is a regression for our board iff NETWORK_PHY_TIMESTAMPING
is enabled. Honestly, I don't know how to proceed here and haven't
tried to replicate the regression due to limited time. Assuming
I can show that it is a regression, what would the solution be then:
reverting the commit? Horatiu, any ideas?

I digress from the original problem a bit. But if there were such
a whitelist, I'd propose that it not contain the lan8814 driver.

Other than that, I guess I have to put some time into testing
before it's too late.

-michael


