Re: [PATCH] [RFC] net: phy: Fix reboot crash if CONFIG_IP_PNP is not set

Hi Ioana,

On Mon, Jan 4, 2021 at 3:53 PM Ioana Ciornei <ioana.ciornei@xxxxxxx> wrote:
> On Mon, Jan 04, 2021 at 01:24:15PM +0100, Geert Uytterhoeven wrote:
> > Wolfram reports that his R-Car H2-based Lager board can no longer be
> > rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
> > The issue can be reproduced on other boards (e.g. Koelsch with R-Car
> > M2-W) too, if CONFIG_IP_PNP is disabled:
>
> What kind of PHYs are used on these boards?

Micrel KSZ8041RNLI

> >     Unhandled fault: imprecise external abort (0x1406) at 0x00000000
> >     pgd = (ptrval)
> >     [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
> >     Internal error: : 1406 [#1] ARM
> >     Modules linked in:
> >     CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
> >     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
> >     PC is at sh_mdio_ctrl+0x44/0x60
> >     LR is at sh_mmd_ctrl+0x20/0x24
> >     ...
> >     Backtrace:
> >     [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
> >      r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
> >     [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
> >     [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
> >      r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
> >     [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
> >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
> >     [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
> >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
> >     [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
> >      r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
> >     [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
> >     [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
> >      r5:c229f800 r4:c229f800
> >     [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
> >      r5:c229f800 r4:c229f804
> >     [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
> >     [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
> >      r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
> >     [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
> >     [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
> >      r5:4321fedc r4:01234567
> >     [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
> >      r7:00000058 r6:00000000 r5:00000000 r4:00000000
> >     [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
> >
> > Calling phy_disable_interrupts() unconditionally means that the PHY
> > registers may be accessed while the device is suspended, causing
> > undefined behavior, which may crash the system.
> >
> > Fix this by calling phy_disable_interrupts() only when the PHY has been
> > started.
> >
> > Reported-by: Wolfram Sang <wsa+renesas@xxxxxxxxxxxxxxxxxxxx>
> > Fixes: e2f016cf775129c0 ("net: phy: add a shutdown procedure")
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> > ---
> > Marked RFC as I do not know if this change breaks the use case fixed by
> > the faulty commit.
>
> I haven't tested it yet but most probably this change would partially
> revert the behavior to how things were before adding the shutdown
> procedure.
>
> And this is because the interrupts are enabled at phy_connect and not at
> phy_start so we would want to disable any PHY interrupts even though the
> PHY has not been started yet.

Makes sense.
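
To make that gap concrete, it can be modelled in a few lines of user-space
C. This is only a sketch under stated assumptions: struct model_phy,
model_connect(), model_start() and model_shutdown_guarded() are hypothetical
stand-ins, not the phylib API. The only behaviour taken from the discussion
above is that interrupts are enabled at connect time, while the guard in the
RFC patch checks whether the PHY has been started.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; NOT the kernel's struct phy_device. */
struct model_phy {
	bool started;		/* set by model_start() */
	bool irq_enabled;	/* set by model_connect() */
};

/* Assumption from the thread: interrupts are enabled when connecting. */
void model_connect(struct model_phy *phy)
{
	phy->irq_enabled = true;
}

void model_start(struct model_phy *phy)
{
	phy->started = true;
}

/* Models the guarded shutdown: only a started PHY gets its IRQ disabled. */
void model_shutdown_guarded(struct model_phy *phy)
{
	if (phy->started)
		phy->irq_enabled = false;
}

/* Self-check covering both cases discussed above. */
void model_demo(void)
{
	struct model_phy phy = { .started = false, .irq_enabled = false };

	model_connect(&phy);
	model_shutdown_guarded(&phy);
	assert(phy.irq_enabled);	/* connected only: IRQ stays enabled */

	model_start(&phy);
	model_shutdown_guarded(&phy);
	assert(!phy.irq_enabled);	/* started: IRQ is disabled */
}
```

In this model a PHY that was connected but never started keeps its interrupt
enabled across shutdown, which is the partial revert Ioana describes.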

> > Alternatively, the device may have to be started
> > explicitly first.
>
> Have you actually tried this out and it worked?

No, I haven't tested restarting the device first.
I would like to avoid starting the device during shutdown, unless it is
absolutely necessary.

> I am asking this because I would much rather expect this to be a problem
> with how the sh_eth driver behaves if the netdevice did not connect to
> the PHY (this is done in .open() alongside the phy_start()) and it
> suddenly has to interact with it through the mdiobb_ops callbacks.
>
> Also, I just re-tested this use case in which I do not start the
> interface and just issue a reboot, and it behaves as expected.

It depends on the hardware: the sh_eth device is powered down when its
module clock is stopped. When powered down, any access to the sh_eth
registers or to the PHY connected to it will cause a crash.

On most other hardware, you can access the PHY regardless, and no crash
will happen.
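
A rough user-space model of that hardware dependency, for illustration
only. Everything here is hypothetical (struct sim_mac, struct sim_phy,
sim_mdio_read() and the two shutdown variants); it is not sh_eth or phylib
code. A fault counter stands in for the imprecise external abort, and the
model assumes that a started PHY implies a running module clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical MAC whose MDIO lines are driven through its own registers. */
struct sim_mac {
	bool clk_on;	/* module clock; off means the block is powered down */
	int faults;	/* accesses that would abort on the real hardware */
};

struct sim_phy {
	struct sim_mac *mac;
	bool started;
};

/* Stand-in for a bit-banged MDIO read going through MAC registers. */
int sim_mdio_read(struct sim_phy *phy)
{
	if (!phy->mac->clk_on) {
		phy->mac->faults++;	/* real hardware: external abort */
		return -1;
	}
	return 0;
}

/* Unconditional shutdown: always touches the PHY. */
void sim_shutdown_always(struct sim_phy *phy)
{
	(void)sim_mdio_read(phy);
}

/* Guarded shutdown, modelled on the RFC patch: skip a PHY not started. */
void sim_shutdown_guarded(struct sim_phy *phy)
{
	if (phy->started)
		(void)sim_mdio_read(phy);
}

/* Self-check: interface never opened, so the clock is off. */
void sim_demo(void)
{
	struct sim_mac mac = { .clk_on = false, .faults = 0 };
	struct sim_phy phy = { .mac = &mac, .started = false };

	sim_shutdown_guarded(&phy);
	assert(mac.faults == 0);	/* no register access, no fault */

	sim_shutdown_always(&phy);
	assert(mac.faults == 1);	/* access with clock off faults */
}
```

On hardware where the PHY stays reachable regardless of the MAC's clock,
the clk_on check would never trigger, and both variants would be safe.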

> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -2962,7 +2962,8 @@ static void phy_shutdown(struct device *dev)
> >  {
> >       struct phy_device *phydev = to_phy_device(dev);
> >
> > -     phy_disable_interrupts(phydev);
> > +     if (phy_is_started(phydev))
> > +             phy_disable_interrupts(phydev);
> >  }
> >
> >  /**

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


