Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 14/02/2025 11:21, Russell King (Oracle) wrote:
On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
Thanks for the feedback. So ...

1. I can confirm that suspend works if I disable EEE via ethtool
2. Prior to this change I do see phy_eee_rx_clock_stop being called
    to enable the clock resuming from suspend, but after this change
    it is not.

Prior to this change I see (note the prints around 389-392 are when
we resume from suspend) ...

[    4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0

This is a bug in phylink - it shouldn't have been calling
phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.

[    4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[    7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1

Presumably, this is when the link comes up before suspend, so the PHY
has been configured to allow the RX clock to be stopped prior to suspend

[  389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[  392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1

Presumably, as this is after resume, this is again when the link comes
up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)

After this change I see ...

[    4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
[    4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[  191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode

To me, this looks no different - the PHY was configured for clock stop
before suspending in both cases.

However, something else to verify with the old code - after boot and the
link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
try unplugging the link and re-plugging it. Then try suspending.

I still need to try this but I am still not back to the office to get to this.
 > The point of this test is to verify whether the PHY ignores changes to
the RX clock stop configuration while the link is up.



The next stage is to instrument dwmac4_set_eee_mode(),
dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
the final register values in each function vs dwmac4_set_lpi_mode() in
the new code. Also, I think instrumenting stmmac_common_interrupt() to
print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
be a good idea.

I'd like to see how this all ties up with suspend, resume, link up
and down events, so please don't trim the log so much.

I have been testing on top of v6.14-rc2 which does not have
dwmac4_set_lpi_mode(). However, instrumenting the other functions,
for a bad case I see ...

[  477.494226] PM: suspend entry (deep)
[  477.501869] Filesystems sync: 0.006 seconds
[  477.504518] Freezing user space processes
[  477.509067] Freezing user space processes completed (elapsed 0.001 seconds)
[  477.514770] OOM killer disabled.
[  477.517940] Freezing remaining freezable tasks
[  477.523449] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  477.566870] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[  477.586423] dwc-eth-dwmac 2490000.ethernet eth0: disable EEE
[  477.592052] dwmac4_set_eee_lpi_entry_timer: entered
[  477.596997] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0x0
[  477.680193] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
[  477.723771] Disabling non-boot CPUs ...
[  477.726898] psci: CPU5 killed (polled 0 ms)
[  477.731364] psci: CPU4 killed (polled 0 ms)
[  477.735439] psci: CPU3 killed (polled 0 ms)
[  477.740198] psci: CPU2 killed (polled 0 ms)
[  477.744220] psci: CPU1 killed (polled 0 ms)
[  477.748546] Enabling non-boot CPUs ...
[  477.751996] Detected PIPT I-cache on CPU1
[  477.754800] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU1: 0x0000009444c004
[  477.766211] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU1: 0x00000010305116
[  477.778379] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU1: 0x00000003001066
[  477.790231] CPU1: Booted secondary processor 0x0000000000 [0x4e0f0030]
[  477.797726] CPU1 is up
[  477.799388] Detected PIPT I-cache on CPU2
[  477.802952] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU2: 0x0000009444c004
[  477.814415] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU2: 0x00000010305116
[  477.826537] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU2: 0x00000003001066
[  477.838440] CPU2: Booted secondary processor 0x0000000001 [0x4e0f0030]
[  477.845865] CPU2 is up
[  477.847325] Detected PIPT I-cache on CPU3
[  477.851136] CPU3: Booted secondary processor 0x0000000101 [0x411fd073]
[  477.857958] CPU3 is up
[  477.860108] Detected PIPT I-cache on CPU4
[  477.863949] CPU4: Booted secondary processor 0x0000000102 [0x411fd073]
[  477.870714] CPU4 is up
[  477.872933] Detected PIPT I-cache on CPU5
[  477.876778] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
[  477.883556] CPU5 is up
[  477.985628] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[  477.993771] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
[  478.171396] dwmac4: Master AXI performs any burst length
[  478.174480] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[  478.181934] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[  478.202977] dwmac4_set_eee_lpi_entry_timer: entered
[  478.207918] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
[  478.287646] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
[  478.295538] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[  478.372819] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[  478.382118] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[  478.410458] OOM killer enabled.
[  478.411350] Restarting tasks ... done.
[  478.415459] VDDIO_SDMMC3_AP: voltage operation not allowed
[  478.417763] random: crng reseeded on system resumption
[  478.425747] PM: suspend exit
[  478.474698] VDDIO_SDMMC3_AP: voltage operation not allowed
[  478.533428] VDDIO_SDMMC3_AP: voltage operation not allowed
[  478.600368] VDDIO_SDMMC3_AP: voltage operation not allowed


For a good case I see ...

[   28.548472] PM: suspend entry (deep)
[   28.560503] Filesystems sync: 0.010 seconds
[   28.563622] Freezing user space processes
[   28.567838] Freezing user space processes completed (elapsed 0.001 seconds)
[   28.573380] OOM killer disabled.
[   28.576563] Freezing remaining freezable tasks
[   28.582100] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[   28.627180] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[   28.646770] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
[   28.690432] Disabling non-boot CPUs ...
[   28.693397] psci: CPU5 killed (polled 4 ms)
[   28.697401] psci: CPU4 killed (polled 4 ms)
[   28.702649] psci: CPU3 killed (polled 0 ms)
[   28.707369] psci: CPU2 killed (polled 0 ms)
[   28.711482] psci: CPU1 killed (polled 0 ms)
[   28.970011] Enabling non-boot CPUs ...
[   28.974751] Detected PIPT I-cache on CPU1
[   28.977635] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU1: 0x0000009444c004
[   28.989014] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU1: 0x00000010305116
[   29.001163] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU1: 0x00000003001066
[   29.013015] CPU1: Booted secondary processor 0x0000000000 [0x4e0f0030]
[   29.020526] CPU1 is up
[   29.022168] Detected PIPT I-cache on CPU2
[   29.025747] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU2: 0x0000009444c004
[   29.037182] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU2: 0x00000010305116
[   29.049328] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU2: 0x00000003001066
[   29.061196] CPU2: Booted secondary processor 0x0000000001 [0x4e0f0030]
[   29.068779] CPU2 is up
[   29.070095] Detected PIPT I-cache on CPU3
[   29.073916] CPU3: Booted secondary processor 0x0000000101 [0x411fd073]
[   29.080749] CPU3 is up
[   29.082898] Detected PIPT I-cache on CPU4
[   29.086729] CPU4: Booted secondary processor 0x0000000102 [0x411fd073]
[   29.093496] CPU4 is up
[   29.095715] Detected PIPT I-cache on CPU5
[   29.099556] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
[   29.106351] CPU5 is up
[   29.218549] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[   29.234190] dwmac4: Master AXI performs any burst length
[   29.237263] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[   29.244732] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   29.267679] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[   29.270504] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[   29.306091] OOM killer enabled.
[   29.306981] Restarting tasks ... done.
[   29.311423] VDDIO_SDMMC3_AP: voltage operation not allowed
[   29.314095] random: crng reseeded on system resumption
[   29.321404] PM: suspend exit
[   29.370286] VDDIO_SDMMC3_AP: voltage operation not allowed
[   29.429655] VDDIO_SDMMC3_AP: voltage operation not allowed
[   29.496567] VDDIO_SDMMC3_AP: voltage operation not allowed
[   32.968855] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
[   32.974779] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_link_up: tx_lpi_timer 1000000
[   32.988755] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx


The more I have been testing, the more I feel that this is timing
related. In good cases, I see the MAC link coming up well after the
PHY. Even with your change I did see suspend work on occassion and
when it does I see ...

[   79.775977] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[   79.784196] dwmac4: Master AXI performs any burst length
[   79.787280] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[   79.794736] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   79.816642] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[   79.820437] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[   79.854481] OOM killer enabled.
[   79.855372] Restarting tasks ... done.
[   79.859460] VDDIO_SDMMC3_AP: voltage operation not allowed
[   79.861297] random: crng reseeded on system resumption
[   79.869773] PM: suspend exit
[   79.914909] VDDIO_SDMMC3_AP: voltage operation not allowed
[   79.974322] VDDIO_SDMMC3_AP: voltage operation not allowed
[   80.041236] VDDIO_SDMMC3_AP: voltage operation not allowed
[   83.547730] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
[   83.566859] dwmac4_set_eee_lpi_entry_timer: entered
[   83.571782] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
[   83.651520] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
[   83.659425] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

On a good case, the stmmac_mac_enable_tx_lpi call always happens
much later. It seems that after this change it is more often
that the link is coming up sooner and I guess probably too soon.
May be we were getting lucky before?

Anyway, I made the following change for testing and this is
working ...

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b34ebb916b89..44187e230a1e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7906,16 +7906,6 @@ int stmmac_resume(struct device *dev)
                        return ret;
        }
- rtnl_lock();
-       if (device_may_wakeup(priv->device) && priv->plat->pmt) {
-               phylink_resume(priv->phylink);
-       } else {
-               phylink_resume(priv->phylink);
-               if (device_may_wakeup(priv->device))
-                       phylink_speed_up(priv->phylink);
-       }
-       rtnl_unlock();
-
        rtnl_lock();
        mutex_lock(&priv->lock);
@@ -7930,6 +7920,13 @@ int stmmac_resume(struct device *dev) stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw); + if (device_may_wakeup(priv->device) && priv->plat->pmt) {
+               phylink_resume(priv->phylink);
+       } else {
+               phylink_resume(priv->phylink);
+               if (device_may_wakeup(priv->device))
+                       phylink_speed_up(priv->phylink);
+       }
        stmmac_enable_all_queues(priv);
        stmmac_enable_all_dma_irq(priv);

I noticed that in __stmmac_open() the phylink_start() is
called after stmmac_hw_setup and stmmac_init_coalesce, where
as in stmmac_resume, phylink_resume() is called before these.
I am not saying that this is correct in any way, but seems
to indicate that the PHY is coming up too soon (at least for
this device). I have ran 100 suspend iterations with the above
and I have not seen any failures.

Let me know if you have any thoughts on this.

Thanks!
Jon

--
nvpublic





[Index of Archives]     [ARM Kernel]     [Linux ARM]     [Linux ARM MSM]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux