On Fri, May 13, 2022 at 4:37 PM Vladimir Oltean <vladimir.oltean@xxxxxxx> wrote: > > This patch set completes the picture described by > '[RFC,devicetree] of: property: mark "interrupts" as optional for fw_devlink' > https://patchwork.kernel.org/project/netdevbpf/patch/20220513201243.2381133-1-vladimir.oltean@xxxxxxx/ I replied to that patch. I don't think we can pull that in. > I've CCed non-networking maintainers just in case they want to gain a > better understanding. If not, apologies and please ignore the rest. > > > > My use case is to migrate a PHY driver from poll mode to interrupt mode > without breaking compatibility between new device trees and old kernels > which did not have a driver for that IRQ parent, and therefore (for > things to work) did not even have that interrupt listed in the "vintage > correct" DT blobs. Note that current kernels as of today are also > "old kernels" in this description. > > Creating some degree of compatibility has multiple components. > > 1. A PHY driver must eventually give up waiting for an IRQ provider, > since the dependency is optional and it can fall back to poll mode. > This is currently supported thanks to commit 74befa447e68 ("net: > mdio: don't defer probe forever if PHY IRQ provider is missing"). > > 2. Before it finally gives up, the PHY driver has a transient phase of > returning -EPROBE_DEFER. That transient phase causes some breakage > which is handled by this patch set, details below. > > 3. PHY device probing and Ethernet controller finding it and connecting > to it are async events. When both happen during probing, the problem > is that finding the PHY fails if the PHY defers probe, which results > in a missing PHY rather than waiting for it. Unfortunately there is > no universal way to address this problem, because the majority of > Ethernet drivers do not connect to the PHY during probe. So the > problem is fixed only for the driver that is of interest to me in > this context, DSA, and with special API exported by phylink > specifically for this purpose, to limit the impact on other drivers. I'll take a closer look at this later this week, but once we add phy-handle support to fw_devlink (the device_bind_driver() is making it hard to add support), I think we can address most/all of these problems automatically. So hopefully we can work towards that? Actually this patch might already fix this for you: https://lore.kernel.org/lkml/20220429220933.1350374-1-saravanak@xxxxxxxxxx/ Before fw_devlink, we'd give up on waiting on all suppliers, whether they had a driver (but hadn't yet probed for a multitude of reasons) or not. fw_devlink is smart about allowing consumers to probe without their suppliers only if the supplier has no driver or the driver fails (I'll send a patch for this). The deferred_probe_timeout is what's used to decide when to give up waiting for drivers. -Saravana > > Note that drivers that connect to the PHY at ndo_open are superficially > "fixed" by the patch at step 1 alone, and therefore don't need the > mechanism introduced in phylink here. This is because of the larger span > of time between PHY probe and opening the network interface (typically > initiated by user space). But this is the catch, nfsroot and other > in-kernel networking users can also open the net device, and this will > still expose the EPROBE_DEFER as a hard error for this second kind of > drivers. I don't know how to fix that. From this POV, it's better to do > what DSA does (connect to the PHY on probe). > > Vladimir Oltean (2): > net: phylink: allow PHY driver to defer probe when connecting via OF > node > net: dsa: wait for PHY to defer probe > > drivers/net/phy/phylink.c | 73 ++++++++++++++++++++++++++++++--------- > include/linux/phylink.h | 2 ++ > net/dsa/dsa2.c | 2 ++ > net/dsa/port.c | 6 ++-- > net/dsa/slave.c | 10 +++--- > 5 files changed, 70 insertions(+), 23 deletions(-) > > -- > 2.25.1 >