On Mon, Feb 20, 2023 at 07:17:11PM +0000, Russell King (Oracle) wrote:
> On Mon, Feb 20, 2023 at 06:29:33PM +0000, Marc Zyngier wrote:
> > Lockdep also reports[1] a possible circular locking dependency between
> > phy_attach_direct() and rtnetlink_rcv_msg(), which looks interesting.
> >
> > [1] https://paste.debian.net/1271454/
>
> Adding Andrew, but really this should be in a separate thread, since
> this has nothing to do with MSI.
>
> It looks like the open path takes the RTNL lock followed by the phydev
> lock, whereas the PHY probe path takes the phydev lock, and then if
> there's a SFP attached to the PHY, we end up taking the RTNL lock.
> That's going to be utterly horrid to try and solve, and isn't going
> to be quick to fix.

What are we actually trying to protect in phy_probe() when we take the
lock and call phydev->drv->probe(phydev) ?

The main purpose of the lock is to protect members of phydev, such as
link, speed, duplex, which can be inconsistent when the lock is not
held. But the PHY is not attached to a MAC yet, so a MAC cannot be
using it, and those members of phydev are not valid yet anyway.

The lock also prevents parallel operation on the device by phylib, but
I cannot think of how that could happen at this early stage in the
life of the PHY.

So maybe we can move the mutex_lock() after the call to
phydev->drv->probe()?

	Andrew
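
To make the ordering argument concrete, here is a rough userspace model
of the two paths. It is not kernel code and does not touch phylib: every
name in it (model_lock(), model_open_path(), model_probe_current(),
model_probe_proposed(), and so on) is made up for illustration, and the
"driver probe takes RTNL" step is only a stand-in for the SFP case
described above. It records which lock was held when another was taken,
roughly the way lockdep builds its dependency graph, and flags an A->B
versus B->A inversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock identifiers, for this model only. */
enum lock_id { LOCK_RTNL, LOCK_PHYDEV, LOCK_COUNT };

static bool held[LOCK_COUNT];
/* order_seen[a][b]: lock b was acquired while lock a was held. */
static bool order_seen[LOCK_COUNT][LOCK_COUNT];
static bool inversion_found;

static void model_reset(void)
{
	memset(held, 0, sizeof(held));
	memset(order_seen, 0, sizeof(order_seen));
	inversion_found = false;
}

static void model_lock(enum lock_id id)
{
	/* Record an ordering edge from every held lock to the new one. */
	for (int a = 0; a < LOCK_COUNT; a++) {
		if (!held[a])
			continue;
		order_seen[a][id] = true;
		/* The reverse edge was seen earlier: ABBA inversion. */
		if (order_seen[id][a])
			inversion_found = true;
	}
	held[id] = true;
}

static void model_unlock(enum lock_id id)
{
	assert(held[id]);
	held[id] = false;
}

/* Open path as described: RTNL first, then the phydev lock. */
static void model_open_path(void)
{
	model_lock(LOCK_RTNL);
	model_lock(LOCK_PHYDEV);
	model_unlock(LOCK_PHYDEV);
	model_unlock(LOCK_RTNL);
}

/* Stand-in for a driver probe that ends up taking RTNL (SFP case). */
static void model_driver_probe(void)
{
	model_lock(LOCK_RTNL);
	model_unlock(LOCK_RTNL);
}

/* Probe as described today: phydev lock held across the driver probe. */
static void model_probe_current(void)
{
	model_lock(LOCK_PHYDEV);
	model_driver_probe();
	model_unlock(LOCK_PHYDEV);
}

/* Proposed ordering: driver probe first, phydev lock taken afterwards. */
static void model_probe_proposed(void)
{
	model_driver_probe();
	model_lock(LOCK_PHYDEV);
	/* the remainder of probe would run here under the lock */
	model_unlock(LOCK_PHYDEV);
}

/* Run the open path plus one probe variant; report any inversion. */
bool model_has_inversion(bool proposed)
{
	model_reset();
	model_open_path();
	if (proposed)
		model_probe_proposed();
	else
		model_probe_current();
	return inversion_found;
}
```

In this model, model_has_inversion(false) reports the inversion and
model_has_inversion(true) does not. That only shows the ordering
problem goes away; whether anything in the driver probe actually relies
on the lock being held is the open question above.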