Hi Lukas,

On 23.05.2022 11:43, Lukas Wunner wrote:
> On Thu, May 19, 2022 at 11:22:36PM +0200, Marek Szyprowski wrote:
>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>> Taking a step back though, I'm wondering if there's a bigger problem here:
>>> This is a USB device, so we stop receiving interrupts once the Interrupt
>>> Endpoint is no longer polled.  But what if a PHY's interrupt is attached
>>> to a GPIO of the SoC and that interrupt is raised while the system is
>>> suspending?  The interrupt handler may likewise try to reach an
>>> inaccessible (suspended) device.
>>>
>>> The right thing to do would probably be to signal wakeup.  But the
>>> PHY drivers' irq handlers instead schedule the phy_state_machine().
>>> Perhaps we need something like the following at the top of
>>> phy_state_machine():
>>>
>>> 	if (phydev->suspended) {
>>> 		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
>>> 		return;
>>> 	}
>>>
>>> However, phydev->suspended is set at the *bottom* of phy_suspend(),
>>> it would have to be set at the *top* of mdio_bus_phy_suspend()
>>> for the above to be correct.  Hmmm...
>> Well, your concern sounds valid, but I don't have a board with such hw
>> configuration, so I cannot really test.
> I'm torn whether I should submit the quick fix in my last e-mail
> or attempt to address the deeper issue.  The quick fix would ensure
> v5.19-rc1 isn't broken, but if possible I'd rather address the deeper
> issue...
>
> Below is another patch.  Would you mind testing if it fixes the problem
> for you?  It's a replacement for the patch in my last e-mail and seeks
> to fix the problem for all drivers, not just smsc95xx.  If you don't
> have time to test it, let me know and I'll just submit the quick fix
> in my previous e-mail.

I've just tested it on top of next-20220519 and I was not able to
reproduce the issue, so it looks like it also fixes the issue. :)

Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>

> BTW, getting a PHY interrupt on suspend seems like a corner case to me,
> so I'm amazed you found this and seem to be able to reproduce it 100%.
> Out of curiosity, is this a CI test you're performing?

I have some semi-automated tests (based on simple bash scripts)
utilizing remote test boards (with remote power on/off control, serial
console, tftp booting). This issue was quite easy to reproduce, even
manually. Maybe it is somehow specific to the Odroid-XU3/XU3-lite boards
and the way the smsc95xx USB ethernet chip is connected there, but it
usually happens there in 2 of 3 suspend/resume tests.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland