On Thu, Sep 02, 2021 at 10:07:49PM +0200, Andrew Lunn wrote:
> > The interrupt controller _has_ been set up. The trouble is that the
> > interrupt controller has the same OF node as the switch itself, and the
> > same OF node. Therefore, fw_devlink waits for the _entire_ switch to
> > finish probing, it doesn't have insight into the fact that the
> > dependency is just on the interrupt controller.
>
> That seems to be the problem. fw_devlink appears to think probe is an
> atomic operation. A device is not probed, or full probed. Where as the
> drivers are making use of it being non atomic.
>
> Maybe fw_devlink needs the third state, probing. And when deciding if
> a device can be probed and depends on a device which is currently
> probing, it looks deeper, follows the phandle and see if the resource
> is actually available?

This is interesting because there already exists a device link state for
when the consumer is "probing", but for the supplier, it's binary:

/**
 * enum device_link_state - Device link states.
 * @DL_STATE_NONE: The presence of the drivers is not being tracked.
 * @DL_STATE_DORMANT: None of the supplier/consumer drivers is present.
 * @DL_STATE_AVAILABLE: The supplier driver is present, but the consumer is not.
 * @DL_STATE_CONSUMER_PROBE: The consumer is probing (supplier driver present).
 * @DL_STATE_ACTIVE: Both the supplier and consumer drivers are present.
 * @DL_STATE_SUPPLIER_UNBIND: The supplier driver is unbinding.
 */

The check that's killing us is in device_links_check_suppliers(), and it
is for DL_STATE_AVAILABLE:

	list_for_each_entry(link, &dev->links.suppliers, c_node) {
		if (!(link->flags & DL_FLAG_MANAGED))
			continue;

		if (link->status != DL_STATE_AVAILABLE &&
		    !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
			device_links_missing_supplier(dev);
			dev_err(dev, "probe deferral - supplier %s not ready\n",
				dev_name(link->supplier));
			ret = -EPROBE_DEFER;
			break;
		}
		WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
	}

Anyway, I was expecting quite a different reaction to this patch series,
especially from Saravana. We are essentially battling to handle an
-EPROBE_DEFER we don't need (the battle might be worth it in the general
case, though, which is one of the reasons I posted the patches).

But these patches also solve DSA's issue with the circular dependency
between the switch and its internal PHYs, and nobody seems to have asked
the most important question: why? The PHY should return -EPROBE_DEFER ad
infinitum, since its supplier has never finished probing by the time it
calls phy_attach_direct().
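
To make the asymmetry concrete, here is a rough user-space model of the
check above next to the "look deeper" idea Andrew describes. It is only a
sketch, not kernel code and not a proposed patch. DL_STATE_SUPPLIER_PROBE,
struct model_link, resource_ready and check_suppliers() are all made up for
illustration; the DL_FLAG_MANAGED / DL_FLAG_SYNC_STATE_ONLY handling and the
status updates of the real function are left out, and EPROBE_DEFER is
redefined locally because user-space headers do not provide it.

```c
/* User-space model only; this is NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value the kernel uses internally, redefined here for user space. */
#define EPROBE_DEFER 517

enum model_link_state {
	DL_STATE_DORMANT,
	DL_STATE_AVAILABLE,
	DL_STATE_CONSUMER_PROBE,
	DL_STATE_ACTIVE,
	DL_STATE_SUPPLIER_UNBIND,
	DL_STATE_SUPPLIER_PROBE,	/* HYPOTHETICAL "third state" */
};

struct model_link {
	enum model_link_state status;
	/* HYPOTHETICAL: the resource behind the phandle is already usable */
	bool resource_ready;
};

/*
 * look_deeper == false: behaves like the quoted check, i.e. any supplier
 * link that is not DL_STATE_AVAILABLE defers the consumer.
 * look_deeper == true: a supplier that is still probing is tolerated if
 * the specific resource the consumer needs is already there.
 */
int check_suppliers(const struct model_link *links, size_t n,
		    bool look_deeper)
{
	assert(links || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (links[i].status == DL_STATE_AVAILABLE)
			continue;

		if (look_deeper &&
		    links[i].status == DL_STATE_SUPPLIER_PROBE &&
		    links[i].resource_ready)
			continue;

		return -EPROBE_DEFER;
	}

	return 0;
}
```

With a single link in the hypothetical "supplier probing" state and
resource_ready set (the switch's interrupt controller case), the model
defers when look_deeper is false and lets the consumer through when it is
true.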