On 15.01.2021 18:19, Badel, Laurent wrote: > Description: > External PHY reset from the FEC driver was introduced in commit [1] to > mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs > because the FEC driver turns off the reference clock for power saving > reasons [2], which doesn't work out well with LAN87xx PHYs which require > a running REF_CLK during the power-up sequence. As a result, the PHYs > occasionnally (and unpredictably) fail to establish a stable link and > require a hardware reset to work reliably. > > As previously noted [3] the solution in [1] integrates poorly with the > PHY abstraction layer, and it also performs many unnecessary resets. This > patch series suggests a simpler solution to this problem, namely to hold > the PHY in reset during the time between the PHY driver probe and the first > opening of the FEC driver. > > To illustrate why this is sufficient, below is a representation of the PHY > RST and REF_CLK status at relevant time points (note that RST signal is > active-low for LAN8710/20): > > 1. During system boot when the PHY is probed: > RST 111111111111111111111000001111111111111 > CLK 000011111111111111111111111111111000000 > REF_CLK is enabled during fec_probe(), and there is a short reset pulse > due to mdiobus_register_gpiod() which calls gpiod_get_optional() with > the GPIOD_OUT_LOW which sets the initial value to 0. The reset is > de-asserted by phy_device_register() shortly after. After that, the PHY > runs without clock until the FEC is opened, which causes issues. > > 2. At first opening of the FEC: > RST 111111111111111111111111111100000111111 > CLK 000000000011111111111111111111111111111 > After REF_CLK is enabled, phy_reset_after_clk_enable() causes a > short reset pulse. Reset is needed here because the PHY was running > without clock before. > > 3. At closing of the FEC driver: > RST 111110000000000000000000000000000000000 > CLK 111111111111000000000000000000000000000 > FEC first disconnects the PHY, which asserts the reset, and then > disables the clock. > > 4. At subsequent openings of the FEC: > RST 000000000000000011111111111110000011111 > CLK 000000000011111111111111111111111111111 > FEC first enables the clock, then connects to the PHY which releases > the reset. Here the second reset pulse (phy_reset_after_clk_enable()) > is unnecessary, because REF_CLK is already running when the reset is > first deasserted. > > This illustrates that the only place where the extra reset pulse is > actually needed, is at the first opening of the FEC driver, and the reason > it is needed in the first place, is because the PHY has been running > without clock after it was probed. > > Extensive testing with LAN8720 confirmed that the REF_CLK can be disabled > without problems as long as the PHY is either in reset or in power-down > mode (which is relevant for suspend-to-ram as well). Therefore, instead > of relying on extra calls to phy_reset_after_clk_enable(), the issue > addressed by commit [1] can be simply fixed by keeping the PHY in reset > when exiting from phy_probe(). In this way the PHY will always be in reset > or power-down whenever the REF_CLK is turned off. > > This should not cause issues, since as per the PAL documentation any > driver that has business with the PHY should at least call phy_attach(), > which will deassert the reset in due time. Therefore this fix probably > works equally well for any PHY, but out of caution the patch uses the > existing PHY_RST_AFTER_CLK_EN driver flag (which it renames), to implement > the fix only for LAN8710/20/40 PHYs. > > Previous versions: > This is the 3rd version of the series; below is a short description of > the previous versions. > > v1: > The solution in [1] has the unfortunate side-effect of breaking the PHY > interrupt system due to the hardware reset erasing the interrupt mask of > the PHY. Patch series v1 suggested performing the extra reset before the > PHY is configured, by moving the call to phy_reset_after_clk_enable() from > the FEC into phy_init_hw() instead. The patch was re-examinated after > finding an issue during resume from suspend, where the PHY also seemed to > require a hardware reset to work properly. > Further investigation showed that this is in fact due to another > peculiarity of the LAN87xx, which also erase their interrupt mask upon > software reset (which is done by phy_init_hw() on resuming from > suspend-to-ram), and is thus a separate issue that will be addressed in > a separate patch. > > v2: > During this time the kernel had moved on and 2 new commits rendered the v1 > fix unnecessary: > [3] allows the extra PHY reset to still be performed from the FEC, but > before the interrupt mask is configured, thereby fixing the above > interrupt mask issue. > [4] allows LAN87xx to take control of the REF_CLK directly, preventing > the FEC from disabling it and thus circumventing the entire REF_CLK > issue. > Patch v2 proposed to fix 2 potential issues with the solution from [4], > namely that (i) failing to set the PHY "clocks" DT property would silently > break the system (because FEC succeeds in disabling the REF_CLK, yet the > extra reset has been removed), and (ii) keeping the REF_CLK enabled > defeated the power-saving purpose of commit [2]. > > The present patch fixes (i), and leaves it up to the user to use the > power-friendly clock management of [2] (leave the DT clocks property > unset), or keep the REF_CLK always enabled (set the clocks property). > It also simplifies the code by removing all calls to > phy_reset_after_clk_enable() and related code, and the function > phy_reset_after_clk_enable() altogether. > > Tests: against net-next (5.11-rc3) with LAN8720 and LAN8742 and iMX283 > SoC. Unfortunately unable to test LAN8740 which has a different form > factor. > > References: > [1] commit 1b0a83ac04e3 ("net: fec: add phy_reset_after_clk_enable() > support") > [2] commit e8fcfcd5684a ("net: fec: optimize the clock management to save > power") > [3] commit 64a632da538a ("net: fec: Fix phy_device lookup for > phy_reset_after_clk_enable()") > [4] commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in > support") > > Laurent Badel (4): > Remove PHY reset in fec_main.c > Remove phy_reset_after_clk_enable() > Rename PHY_RST_AFTER_CLK_EN to PHY_RST_AFTER_PROBE > Add PHY reset after probe for PHYs with PHY_RST_AFTER_PROBE flag > > drivers/net/ethernet/freescale/fec_main.c | 40 ----------------------- > drivers/net/phy/phy_device.c | 26 +-------------- > drivers/net/phy/smsc.c | 4 +-- > include/linux/phy.h | 3 +- > 4 files changed, 4 insertions(+), 69 deletions(-) > W/o knowing the exact issue in detail: For the sake of bisectability, shouldn't patches 4 and a modified patch 3 be first? After patches 1 and 2 we may be in trouble, right? And it seems you composed the mails manually: - subject is prefixed with "Subject:" - patch subjects in the cover letter don't match the ones of the respective patches For the patch prefixes: Please add a space between net: and phy: