From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>

Currently, in mhi_pci_runtime_resume(), if the resume fails, recovery_work
is started asynchronously and success is returned. But this doesn't align
with what the PM core expects, as documented in
Documentation/power/runtime_pm.rst:

"Once the subsystem-level resume callback (or the driver resume callback,
if invoked directly) has completed successfully, the PM core regards the
device as fully operational, which means that the device _must_ be able to
complete I/O operations as needed. The runtime PM status of the device is
then 'active'."

So the PM core ends up marking the runtime PM status of the device as
'active', even though the device is not able to handle I/O operations. The
same condition more or less applies to system resume as well.

So to avoid this ambiguity, try to recover the device synchronously from
mhi_pci_runtime_resume() and return the actual error code if the recovery
fails.

To do so, move the recovery code to the __mhi_pci_recovery_work() helper
and call it from both mhi_pci_recovery_work() and
mhi_pci_runtime_resume(). The former still ignores the return value, while
the latter passes it to the PM core.

Cc: stable@xxxxxxxxxxxxxxx # 5.13
Reported-by: Johan Hovold <johan@xxxxxxxxxx>
Closes: https://lore.kernel.org/mhi/Z2PbEPYpqFfrLSJi@xxxxxxxxxxxxxxxxxxxx
Fixes: d3800c1dce24 ("bus: mhi: pci_generic: Add support for runtime PM")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
---
 drivers/bus/mhi/host/pci_generic.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/mhi/host/pci_generic.c b/drivers/bus/mhi/host/pci_generic.c
index e92df380c785..f6de407e077e 100644
--- a/drivers/bus/mhi/host/pci_generic.c
+++ b/drivers/bus/mhi/host/pci_generic.c
@@ -997,10 +997,8 @@ static void mhi_pci_runtime_put(struct mhi_controller *mhi_cntrl)
 	pm_runtime_put(mhi_cntrl->cntrl_dev);
 }
 
-static void mhi_pci_recovery_work(struct work_struct *work)
+static int __mhi_pci_recovery_work(struct mhi_pci_device *mhi_pdev)
 {
-	struct mhi_pci_device *mhi_pdev = container_of(work, struct mhi_pci_device,
-						       recovery_work);
 	struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
 	struct pci_dev *pdev = to_pci_dev(mhi_cntrl->cntrl_dev);
 	int err;
@@ -1035,13 +1033,25 @@ static void mhi_pci_recovery_work(struct work_struct *work)
 
 	set_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status);
 	mod_timer(&mhi_pdev->health_check_timer, jiffies + HEALTH_CHECK_PERIOD);
-	return;
+
+	return 0;
 
 err_unprepare:
 	mhi_unprepare_after_power_down(mhi_cntrl);
 err_try_reset:
-	if (pci_try_reset_function(pdev))
+	err = pci_try_reset_function(pdev);
+	if (err)
 		dev_err(&pdev->dev, "Recovery failed\n");
+
+	return err;
+}
+
+static void mhi_pci_recovery_work(struct work_struct *work)
+{
+	struct mhi_pci_device *mhi_pdev = container_of(work, struct mhi_pci_device,
+						       recovery_work);
+
+	__mhi_pci_recovery_work(mhi_pdev);
 }
 
 static void health_check(struct timer_list *t)
@@ -1400,15 +1410,10 @@ static int __maybe_unused mhi_pci_runtime_resume(struct device *dev)
 	return 0;
 
 err_recovery:
-	/* Do not fail to not mess up our PCI device state, the device likely
-	 * lost power (d3cold) and we simply need to reset it from the recovery
-	 * procedure, trigger the recovery asynchronously to prevent system
-	 * suspend exit delaying.
-	 */
-	queue_work(system_long_wq, &mhi_pdev->recovery_work);
+	err = __mhi_pci_recovery_work(mhi_pdev);
 	pm_runtime_mark_last_busy(dev);
 
-	return 0;
+	return err;
 }
 
 static int __maybe_unused mhi_pci_suspend(struct device *dev)
-- 
2.25.1
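
For readers who want the shape of the change without the driver context, here
is a minimal userspace sketch of the pattern the patch applies: one recovery
helper that returns a status, a work-style wrapper that drops the status, and
a resume-style caller that propagates it. Every name below (fake_dev,
fake_recover(), fake_recovery_work(), fake_runtime_resume()) is hypothetical
and only stands in for the driver's types and functions; the sketch does not
use any kernel API and is not the driver code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct fake_dev {
	bool recoverable;	/* would a reset bring the device back? */
	bool started;		/* set once the device can handle I/O */
};

/* Shared helper: attempt recovery and report the outcome to the caller. */
static int fake_recover(struct fake_dev *dev)
{
	if (!dev->recoverable)
		return -EIO;

	dev->started = true;
	return 0;
}

/* Work-style path: nobody is waiting for a result, so it is dropped. */
static void fake_recovery_work(struct fake_dev *dev)
{
	(void)fake_recover(dev);
}

/*
 * Resume-style path: report success only if the device is usable.
 * resume_err models the result of the normal resume steps.
 */
static int fake_runtime_resume(struct fake_dev *dev, int resume_err)
{
	if (!resume_err) {
		dev->started = true;
		return 0;
	}

	/* Recover synchronously and hand the real status to the caller. */
	return fake_recover(dev);
}
```

With this split, a caller of fake_runtime_resume() sees 0 only when the device
is actually started, which mirrors the guarantee the PM core expects from a
successful resume callback. The asynchronous wrapper keeps its void signature,
so existing users of the work item do not need to change.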