Hi Heiner, On Wed, Mar 11, 2020 at 10:22 PM Heiner Kallweit <hkallweit1@xxxxxxxxx> wrote: > On 11.03.2020 10:17, Geert Uytterhoeven wrote: > > On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote: > >> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote: > >>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@xxxxxxxxxxxxx> wrote: > >>>> From: Florian Fainelli <f.fainelli@xxxxxxxxx> > >>>> Date: Thu, 20 Feb 2020 15:34:53 -0800 > >>>> > >>>>> It is currently possible for a PHY device to be suspended as part of a > >>>>> network device driver's suspend call while it is still being attached to > >>>>> that net_device, either via phy_suspend() or implicitly via phy_stop(). > >>>>> > >>>>> Later on, when the MDIO bus controller get suspended, we would attempt > >>>>> to suspend again the PHY because it is still attached to a network > >>>>> device. > >>>>> > >>>>> This is both a waste of time and creates an opportunity for improper > >>>>> clock/power management bugs to creep in. > >>>>> > >>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY") > >>>>> Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx> > >>>> > >>>> Applied, and queued up for -stable, thanks Florian. > >>> > >>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g. > >>> After resume from s2ram, Ethernet no longer works: > >>> > >>> PM: suspend exit > >>> nfs: server aaa.bbb.ccc.ddd not responding, still trying > >>> ... > >>> > >>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends") > >>> fixes the issue. > >>> > >>> On both boards, an SMSC LAN9220 is connected to a power-managed local > >>> bus. > >>> > >>> I added some debug code to check when the clock driving the local bus > >>> is stopped and started, but I see no difference before/after. Hence I > >>> suspect the Ethernet chip is no longer reinitialized after resume. > >> > >> Can you provide a complete log? > > > > With some debug info: > > > > SDHI0 Vcc: disabling > > PM: suspend entry (deep) > > Filesystems sync: 0.002 seconds > > Freezing user space processes ... (elapsed 0.001 seconds) done. > > OOM killer disabled. > > Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. > > PM: ==== a3sp/ee120000.sd: stop > > PM: ==== a3sp/ee100000.sd: stop > > smsc911x 8000000.ethernet: smsc911x_suspend:2577 > > smsc911x 8000000.ethernet: smsc911x_suspend:2579 running > > smsc911x 8000000.ethernet: smsc911x_suspend:2584 > > PM: ==== a3sp/ee200000.mmc: stop > > PM: ==== c4/fec10000.bus: stop > > PM: ==== a3sp/e6c40000.serial: stop > > PM: ==== c5/e61f0000.thermal: stop > > PM: ==== c4/e61c0200.interrupt-controller: stop > > PM: == a3sp: power off > > rmobile_pd_power_down: a3sp > > Disabling non-boot CPUs ... > > PM: ==== c4/e61c0200.interrupt-controller: start > > PM: ==== c5/e61f0000.thermal: start > > PM: ==== a3sp/e6c40000.serial: start > > PM: ==== c4/fec10000.bus: start > > PM: ==== a3sp/ee200000.mmc: start > > smsc911x 8000000.ethernet: smsc911x_resume:2606 > > smsc911x 8000000.ethernet: smsc911x_resume:2625 running > > PM: ==== a3sp/ee100000.sd: start > > OOM killer enabled. > > Restarting tasks ... done. > > PM: ==== a3sp/ee120000.sd: start > > PM: suspend exit > > nfs: server aaa.bbb.ccc.ddd not responding, still trying > > ... > > > > But no difference between the good and the bad case, except for the nfs > > failures. > > > >> Do you use the Generic PHY driver or a > >> specialized one? > > > > CONFIG_FIXED_PHY=y > > CONFIG_SMSC_PHY=y > > > > Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts > > > >> Do you have a way to dump the registers at the time of > >> failure and see if BMCR.PDOWN is still set somehow? > > > > Added a hook into "nfs: server not responding", which prints: > > > > MII_BMCR = 0x1900 > > > > i.e. BMCR_PDOWN = 0x0800 is still set. > > > >> Does the following help: > >> > >> diff --git a/drivers/net/ethernet/smsc/smsc911x.c > >> b/drivers/net/ethernet/smsc/smsc911x.c > >> index 49a6a9167af4..df17190c76c0 100644 > >> --- a/drivers/net/ethernet/smsc/smsc911x.c > >> +++ b/drivers/net/ethernet/smsc/smsc911x.c > >> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev) > >> if (netif_running(ndev)) { > >> netif_device_attach(ndev); > >> netif_start_queue(ndev); > >> + phy_resume(dev->phydev); > >> } > >> > > > > Yes i does, after s/dev->/ndev->/. > > Thanks! > > This seems to be a workaround. And the same issue we may have with I agree. > other drivers too. Could you please alternatively test the following? > It tackles the issue that mdio_bus_phy_may_suspend() is used in > suspend AND resume, and both calls may return different values. > > With this patch we call mdio_bus_phy_may_suspend() only when > suspending, and let the phy_device store whether it was suspended > by MDIO bus PM. Thanks, your patch fixes the issue, too. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds