Re: [BUG bisect] Missing Micrel driver on VF50 (net: phy: check return code when requesting PHY driver module)

Hi Heiner,

On Fri, Jan 18, 2019 at 9:58 PM Heiner Kallweit <hkallweit1@xxxxxxxxx> wrote:
> On 18.01.2019 09:48, Krzysztof Kozlowski wrote:
> > On Fri, 18 Jan 2019 at 09:39, Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote:
> >> On today's next (next-20190118) my Colibri VF50 board fails to boot up
> >> from network (DHCP, NFSv4 root). Looks like missing network adapter.
> >> Expected:
> >> [ 3.041773] Micrel KSZ8041 400d1000.ethernet-1:00: attached PHY driver
> >> [Micrel KSZ8041] (mii_bus:phy_addr=400d1000.ethernet-1:00, irq=POLL)
> >>
> >> Result:
> >> [ 15.614964] Root-NFS: no NFS server address
> >> [ 15.619353] VFS: Unable to mount root fs via NFS, trying floppy.
> >> [ 15.626762] VFS: Cannot open root device "nfs" or unknown-block(2,0): error -6
> >> [ 15.634252] Please append a correct "root=" boot option; here are the
> >> available partitions:
> >> [ 15.642791] 0100 16384 ram0
> >> [ 15.642804] (driver?)
> >> [ 15.649076] Kernel panic - not syncing: VFS: Unable to mount root fs
> >> on unknown-block(2,0)
> >> [ 15.657423] ---[ end Kernel panic - not syncing: VFS: Unable to mount
> >> root fs on unknown-block(2,0) ]---
> >>
> >> Bisect pointed to:
> >>     net: phy: check return code when requesting PHY driver module
> >
> > I see now in the logs:
> > [ 2.436822] mdio_bus 400d1000.ethernet-1:00: error -2 loading PHY
> > driver module for ID 0x00221513
> > which is kind of misleading. There is no initramfs so there is no
> > usermod library at this point. It is not needed. This seems to break
> > all DHCP/NFS root boots without initrd/initramfs.
> >
> Thanks for the report. Could you please provide your kernel config
> and the syslog of a boot before or w/o the patch in question?
>
> If you boot via nfs then I'd expect that the PHY driver is built-in and
> not a module. Therefore it's not fully clear to me yet why
> request_module() returns -ENOENT.

I'm seeing the same failure when booting with nfsroot on several Renesas boards.
E.g. on r8a7791/koelsch:

    mdio_bus ee700000.ethernet-ffffffff:01: error -2 loading PHY driver module for ID 0x00221537
    sh-eth ee700000.ethernet: MDIO init failed: -2

This failure happens only when CONFIG_MODULES=y.
Reverting commit 13d0ab6750b20957 ("net: phy: check return code when
requesting PHY driver module") fixes the issue.

phy_request_driver_module() tries to load module
"mdio:00000000001000100001010100110111", which fails.
When CONFIG_MODULES=n, the error is ignored, and everything works fine.
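
For reference, the module name is just the "mdio:" prefix followed by the
32 bits of the PHY ID, most significant bit first. A rough user-space
sketch of that formatting (phy_id_to_modalias() and MODALIAS_LEN are names
made up for illustration, not kernel symbols; the kernel builds the string
with MDIO_MODULE_PREFIX, MDIO_ID_FMT and MDIO_ID_ARGS()):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "mdio:" + 32 binary digits + terminating NUL */
#define MODALIAS_LEN (5 + 32 + 1)

/*
 * Hypothetical helper: render a 32-bit PHY ID as "mdio:" followed by its
 * 32 bits, MSB first. Returns 0 on success, -1 if the buffer is too small.
 */
static int phy_id_to_modalias(uint32_t phy_id, char *buf, size_t len)
{
	if (len < MODALIAS_LEN)
		return -1;

	memcpy(buf, "mdio:", 5);
	for (int bit = 31; bit >= 0; bit--)
		buf[5 + (31 - bit)] = ((phy_id >> bit) & 1) ? '1' : '0';
	buf[5 + 32] = '\0';

	return 0;
}
```

Feeding it 0x00221537 gives the module name quoted above.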

0b00000000001000100001010100110111 == 0x00221537 == PHY_ID_KSZ8041RNLI,
which is served by drivers/net/phy/micrel.c.
Interestingly, CONFIG_MICREL_PHY=y, so I'm wondering why the PHY subsystem
tries to load a module for a driver which is already present in the first
place?

Oh, the following comment tries to explain:

        /* Request the appropriate module unconditionally; don't
         * bother trying to do so only if it isn't already loaded,
         * because that gets complicated. A hotplug event would have
         * done an unconditional modprobe anyway.

Hence request_module() failures are normal.

       ret = request_module(MDIO_MODULE_PREFIX MDIO_ID_FMT,
                            MDIO_ID_ARGS(phy_id));
       /* we only check for failures in executing the usermode binary,
        * not whether a PHY driver module exists for the PHY ID
        */
       if (IS_ENABLED(CONFIG_MODULES) && ret < 0) {
               phydev_err(dev, "error %d loading PHY driver module for ID 0x%08x\n",
                          ret, phy_id);
               return ret;
       }

However:

    /**
     * __request_module - try to load a kernel module
     * @wait: wait (or not) for the operation to complete
     * @fmt: printf style format string for the name of the module
     * @...: arguments as specified in the format string
     *
     * Load a module using the user mode module loader. The function returns
     * zero on success or a negative errno code or positive exit code from
     * "modprobe" on failure.

So perhaps the check should be for "ret > 0"?
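
To spell out the three outcomes that the kerneldoc above describes, here is
a small user-space sketch. The enum and function names are made up for
illustration and are not kernel API; the sketch only classifies a return
value and does not say which cases the PHY code ought to treat as fatal.

```c
/* Hypothetical classification of a request_module()-style return value. */
enum req_mod_outcome {
	REQ_MOD_OK,		/* 0: module loader reported success */
	REQ_MOD_NEG_ERRNO,	/* < 0: negative errno code */
	REQ_MOD_MODPROBE_EXIT,	/* > 0: positive exit code from "modprobe" */
};

static enum req_mod_outcome classify_request_module_ret(int ret)
{
	if (ret == 0)
		return REQ_MOD_OK;
	if (ret < 0)
		return REQ_MOD_NEG_ERRNO;
	return REQ_MOD_MODPROBE_EXIT;
}
```

The -2 in the logs above lands in the REQ_MOD_NEG_ERRNO bucket, which is
the one the current "ret < 0" check treats as an error.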

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
