Greg, Florian, Vladimir, Alvin, Let's continue the rest of the discussion here. On Thu, Aug 26, 2021 at 6:13 AM Andrew Lunn <andrew@xxxxxxx> wrote: > > On Thu, Aug 26, 2021 at 12:45:24AM -0700, Saravana Kannan wrote: > > If a parent device is also a supplier to a child device, fw_devlink=on > > (correctly) delays the probe() of the child device until the probe() of > > the parent finishes successfully. > > > > However, some drivers of such parent devices (where parent is also a > > supplier) incorrectly expect the child device to finish probing > > successfully as soon as they are added using device_add() and before the > > probe() of the parent device has completed successfully. > > Please can you point at the code making this assumption. It sounds > like we are missing some EPROBE_DEFER handling in the driver, or maybe > the DSA framework. For context, this was discussed and explained in [1] and subsequent replies. But let me summarize it here. Alvin reported an issue that with fw_devlink=on, his downstream hardware which is very similar to [2] doesn't have its PHYs probed correctly. Instead of the PHYs being probed by the specific driver, it gets probed by the "Generic PHY" driver. For those who aren't very familiar with PHYs/networking (this is based on what Andrew explained to me earlier), Ethernet PHYs follow a specific standard and can have some extended functionality. The specific driver would give the full functionality, but if it's not available when the PHY needs to be used/connected, the generic PHY driver is force bound to the PHY and it gives the basic functionality. So upon digging into this, this is what I found and where I think we have some bad assumptions about the driver core are present: The DT node in [2] is probed by realtek_smi_probe() [3]. The call flow is: realtek_smi_probe() -> dsa_register_switch() -> dsa_switch_probe() -> dsa_tree_setup() -> dsa_tree_setup_switches() -> dsa_switch_setup() -> ds->ops->setup(ds) -> rtl8366rb_setup() -> realtek_smi_setup_mdio() -> of_mdiobus_register() This scans the MDIO bus/DT and device_add()s the PHYs -> dsa_port_setup() -> dsa_port_link_register_of() -> dsa_port_phylink_register() -> phylink_of_phy_connect() -> phylink_fwnode_phy_connect() -> phy_attach_direct() This checks if PHY device has already probed (by checking for dev->driver). If not, it forces the probe of the PHY using one of the generic PHY drivers. So within dsa_register_switch() the PHY device is added and then expected to have probed in the same thread/calling context. As stated earlier, this is not guaranteed by the driver core. And this is what needs fixing. This works as long as the PHYs don't have dependencies on any other devices/suppliers and never defer probe. In the issue Alvin reported, the PHYs have a dependency and things fall apart. I don't have a strong opinion on whether this is a framework level fix or fixes in a few drivers. In the specific instance of [2] (providing snippet below to make it easier to follow), the "phy0" device [4] depends on the "switch" device [2] since "switch_intc" (the interrupt provider for phy0) is initialized by the "switch" driver. And fw_devlink=on delays the probe of phy0 until switch[2] finishes probing successfully (i.e. after dsa_register_switch() <- realtek_smi_probe() returns) -- this is the whole point of fw_devlink=on this is what reduces the useless deferred probes/probe attempts of consumers before the suppliers finish probing successfully. Since dsa_register_switch() assumes the PHYs would have been probed as soon as they are added, but they aren't probed in this case, the PHY is force bound to the generic PHY driver. Which is the original issue Alvin reported. Hope this clears things up for everyone. -Saravana switch { compatible = "realtek,rtl8366rb"; ... switch_intc: interrupt-controller { ... }; ports { ... port@0 { phy-handle = <&phy0>; }; port@1 { }; ... }; mdio { compatible = "realtek,smi-mdio"; ... phy0: phy@0 { ... interrupt-parent = <&switch_intc>; interrupts = <0>; }; ... }; }; [1] - https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@xxxxxxxxxxxxxx/ [2] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n190 [3] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/dsa/realtek-smi-core.c?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n386 [4] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/gemini-dlink-dir-685.dts?id=73f3af7b4611d77bdaea303fb639333eb28e37d7#n255