On Fri, Aug 27, 2021 at 6:44 AM Andrew Lunn <andrew@xxxxxxx> wrote: > > > fw_devlink=on/device links short circuits the probe() call of a > > consumer (in this case the PHY) and returns -EPROBE_DEFER if the > > supplier's (in this case switch) probe hasn't finished without an > > error. fw_devlink/device links effectively does the probe in graph > > topological order and there's a ton of good reasons to do it that way > > -- what's why fw_devlink=on was implemented. > > > > In this specific case though, since the PHY depends on the parent > > device, if we fail the parent's probe realtek_smi_probe() because the > > PHYs failed to probe, we'll get into a catch-22/chicken-n-egg > > situation and the switch/PHYs will never probe. > > So lets look at: > > arch/arm/boot/dts/vf610-zii-dev-rev-b.dts > > mdio-mux { > compatible = "mdio-mux-gpio"; > pinctrl-0 = <&pinctrl_mdio_mux>; > pinctrl-names = "default"; > gpios = <&gpio0 8 GPIO_ACTIVE_HIGH > &gpio0 9 GPIO_ACTIVE_HIGH > &gpio0 24 GPIO_ACTIVE_HIGH > &gpio0 25 GPIO_ACTIVE_HIGH>; > mdio-parent-bus = <&mdio1>; > #address-cells = <1>; > #size-cells = <0>; > > > We have an MDIO multiplexor > > > mdio_mux_1: mdio@1 { > reg = <1>; > #address-cells = <1>; > #size-cells = <0>; > > switch0: switch@0 { > compatible = "marvell,mv88e6085"; > pinctrl-0 = <&pinctrl_gpio_switch0>; > pinctrl-names = "default"; > reg = <0>; > dsa,member = <0 0>; > interrupt-parent = <&gpio0>; > interrupts = <27 IRQ_TYPE_LEVEL_LOW>; > > On the first bus, we have a Ethernet switch. > > interrupt-controller; > #interrupt-cells = <2>; > eeprom-length = <512>; > > ports { > #address-cells = <1>; > #size-cells = <0>; > > port@0 { > reg = <0>; > label = "lan0"; > phy-handle = <&switch0phy0>; > }; > > The first port of that switch has a pointer to a PHY. > > mdio { > #address-cells = <1>; > #size-cells = <0>; > > That Ethernet switch also has an MDIO bus, > > switch0phy0: switch0phy0@0 { > reg = <0>; > > On that bus is the PHY. > > interrupt-parent = <&switch0>; > interrupts = <0 IRQ_TYPE_LEVEL_HIGH>; > > And that PHY has an interrupt. And that interrupt is provided by the switch. > > Given your description, it sounds like this is also go to break. Based on what you pasted here (I didn't look any closer), I think it will break too. > > vf610-zii-dev-rev-c.dts is the same pattern, and there are more > examples for mv88e6xxx. > > It is a common pattern, e.g. the mips ar9331.dtsi follows it. Then I think this should be solved at the DSA framework level. Make a component-master/aggregate device made up of the switches and ports/PHYs. Then wait for all of them to not -EPROBE_DEFER and then initialize the DSA? > I've not yet looked at plain Ethernet drivers. This pattern could also > exist there. And i wonder about other complex structures, i2c bus > multiplexors, you can have interrupt controllers as i2c devices, > etc. So the general case could exist in other places. I haven't seen any generic issues like this reported so far. It's only after adding phy-handle that we are hitting these issues with DSA switches. > I don't think we should be playing whack-a-mole by changing drivers as > we find they regress and break. We need a generic fix. I think the > solution is pretty clear. As you said the device depends on its > parent. DT is a tree, so it is easy to walk up the tree to detect this > relationship, and not fail the probe. It's easy to do, but it is the wrong behavior for fw_devlink=on. There are plenty of cases where it's better to delay the child device's probe until the parent finishes. You even gave an example[7] where it would help avoid unnecessary deferred probes. There are plenty of other cases like this too -- there's actually a USB driver that had an infinite deferred probe loop that fw_devlink=on fixes. Also, the whole point of fw_devlink=on is to enforce ordering like this -- so just blanket ignoring dependencies on parent devices doesn't make sense. But a parent device's probe depending on a child device's probe to succeed as soon as it's added is never right though. So I think that's what needs to be addresses. So we have a couple of options: 1. Use a component driver model to initialize switches. I think it could be doable at the DSA framework level. 2. Ask fw_devlink=on to ignore it for all switch devices -- it might be possible to move my "quick fix" to the DSA framework. 3. Remove fw_devlink support for phy-handle. I honestly think (1) is the best option and makes sense logically too. Not saying it's a trivial work or a one liner, but it actually makes sense. (2) might not be possible -- I need to take a closer look. I'd prefer not doing (3), but I'd take that over breaking the whole point of fw_devlink=on. -Saravana [7] - https://lore.kernel.org/netdev/YR5eMeKzcuYtB6Tk@xxxxxxx/