On Tue, Jan 23, 2024 at 06:25:52PM +0100, Johan Hovold wrote:
> On Mon, Jan 22, 2024 at 12:26:15PM -0600, Bjorn Helgaas wrote:
> > On Mon, Jan 22, 2024 at 11:53:35AM +0100, Johan Hovold wrote:
> > > I never got a reply to this one so resending with updated Subject in
> > > case it got buried in your inbox.
> >
> > I did see it but decided it was better to fix the problem with resume
> > causing an unintended reboot, even though fixing that meant breaking
> > lockdep again, since I don't think we have user reports of the
> > potential deadlock lockdep finds.
>
> That may be because I fixed the previous regression in 6.7-rc1 before
> any users had a chance to hit the deadlock on Qualcomm platforms.
>
> I can easily trigger a deadlock on the X13s by instrumenting 6.7-final
> with a delay to increase the race window.
>
> And any user hitting this occasionally is likely not going to be able to
> track it down to this lock inversion (unless they have lockdep enabled).

I agree, it's a problem we need to fix.

> > 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") was a
> > start at fixing other problems and also improving the ASPM style, so I
> > hope somebody steps up to fix both it and the lockdep issue. I
> > haven't looked at it enough to have a preference for *how* to fix it.
>
> Ok, but since you were the one introducing the locking regression in
> 6.7-final shouldn't you look into fixing it?
>
> Especially if there were alternatives to restoring the offending commit
> which would solve the underlying issue for the resume failure without
> breaking other platforms.

Did somebody propose an alternate patch? If so, I missed it, but we
could look at it now.

> I don't want to spend more time on this if the offending commit could
> simply be reverted.

I don't quite follow. By simply reverting, do you mean to revert
f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")?
IIUC that would break Michael's machine again.

Bjorn