On Thu, Dec 19, 2024 at 09:36:32AM +0100, Johan Hovold wrote:
> On Thu, Dec 19, 2024 at 12:05:55AM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Dec 18, 2024 at 03:26:38PM +0100, Johan Hovold wrote:
> > > On Wed, Dec 18, 2024 at 07:39:10PM +0530, Manivannan Sadhasivam wrote:
> > > > On Wed, Dec 18, 2024 at 02:55:02PM +0100, Johan Hovold wrote:
>
> > > > > But that's not going to happen as that reset is what is currently
> > > > > causing the deadlock and which would simply be skipped if you switch to
> > > > > pci_try_reset_function().
> > > > >
> > > >
> > > > mhi_pci_runtime_resume() will queue the recovery_work() and return. So I was
> > > > hoping that by the time pci_try_reset_function() is called, the lock would be
> > > > available.
> > >
> > > We can't rely on luck with timings, and this is the very reason for the
> > > deadlock I'm currently seeing (i.e. the recovery thread is still running
> > > when another thread grabs the lock and waits for the recovery thread to
> > > finish).
> > >
> > > Perhaps the recovery work should be done synchronously in the resume
> > > handler to avoid such issues.
> >
> > Synchronously? How can that help when the recovery_work() cannot acquire the
> > lock?
>
> During system suspend, pm core waits for any on-going runtime resume
> operations to complete before taking the device lock and suspending the
> device.
>

Right, but mhi_pci_runtime_resume() is also called from mhi_pci_resume(). So we
cannot safely carry out the recovery_work() synchronously without the
pci_try_reset_function() change.

> Unfortunately, that's currently not the case during shutdown() where
> those operations are reversed, so that would indeed need to be addressed
> first.
>
> But what the driver is currently doing looks highly questionable as it
> returns success when it failed to resume the device (after scheduling
> the asynchronous recovery work).
>

I completely agree and this goes against what PM core expects. IMO we need two
fixes, one uses pci_try_reset_function() and another recovers the device
synchronously from mhi_pci_runtime_resume() and passes the return value to PM
core. Will post the patches.

- Mani

-- 
மணிவண்ணன் சதாசிவம்