From: Peter Wang <peter.wang@xxxxxxxxxxxx>

When the SSU (Active) command sent during wlun resume times out, the SCSI
error handler tries eh_host_reset_handler. ufshcd_eh_host_reset_handler
then hangs waiting in flush_work(&hba->eh_work), and ufshcd_err_handler
hangs waiting for the runtime PM resume to complete. Do link recovery only
in this case.

Below is the IO hang stack dump.

<ffffffdd78e02b34> schedule+0x110/0x204
<ffffffdd78e0be60> schedule_timeout+0x98/0x138
<ffffffdd78e040e8> wait_for_common_io+0x130/0x2d0
<ffffffdd77d6a000> blk_execute_rq+0x10c/0x16c
<ffffffdd78126d90> __scsi_execute+0xfc/0x278
<ffffffdd7813891c> ufshcd_set_dev_pwr_mode+0x1c8/0x40c
<ffffffdd78137d1c> __ufshcd_wl_resume+0xf0/0x5cc
<ffffffdd78137ae0> ufshcd_wl_runtime_resume+0x40/0x18c
<ffffffdd78136108> scsi_runtime_resume+0x88/0x104
<ffffffdd7809a4f8> __rpm_callback+0x1a0/0xaec
<ffffffdd7809b624> rpm_resume+0x7e0/0xcd0
<ffffffdd7809a788> __rpm_callback+0x430/0xaec
<ffffffdd7809b644> rpm_resume+0x800/0xcd0
<ffffffdd780a0778> pm_runtime_work+0x148/0x198

<ffffffdd78e02b34> schedule+0x110/0x204
<ffffffdd78e0be10> schedule_timeout+0x48/0x138
<ffffffdd78e03d9c> wait_for_common+0x144/0x2dc
<ffffffdd7758bba4> __flush_work+0x3d0/0x508
<ffffffdd7815572c> ufshcd_eh_host_reset_handler+0x134/0x3a8
<ffffffdd781216f4> scsi_try_host_reset+0x54/0x204
<ffffffdd78120594> scsi_eh_ready_devs+0xb30/0xd48
<ffffffdd7812373c> scsi_error_handler+0x260/0x874

<ffffffdd78e02b34> schedule+0x110/0x204
<ffffffdd7809af64> rpm_resume+0x120/0xcd0
<ffffffdd7809fde8> __pm_runtime_resume+0xa0/0x17c
<ffffffdd7815193c> ufshcd_err_handling_prepare+0x40/0x430
<ffffffdd7814cce8> ufshcd_err_handler+0x1c4/0xd4c

Signed-off-by: Peter Wang <peter.wang@xxxxxxxxxxxx>
---
 drivers/ufs/core/ufshcd.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index e18c9f4463ec..5aaffd13e132 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -7363,9 +7363,27 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd)
 	int err = SUCCESS;
 	unsigned long flags;
 	struct ufs_hba *hba;
+	struct device *dev;
 
 	hba = shost_priv(cmd->device->host);
 
+	/*
+	 * If __ufshcd_wl_suspend fails and runtime_status == RPM_RESUMING,
+	 * do link recovery only. Scheduling eh work would deadlock: eh work
+	 * waits in ufshcd_rpm_get_sync for the wlun resume, while the failed
+	 * wlun resume waits for eh work to finish.
+	 */
+	dev = &hba->sdev_ufs_device->sdev_gendev;
+	if (dev->power.runtime_status == RPM_RESUMING) {
+		err = ufshcd_link_recovery(hba);
+		if (err) {
+			dev_err(hba->dev, "WL Device PM: status:%d, err:%d\n",
+				dev->power.runtime_status,
+				dev->power.runtime_error);
+		}
+		return err;
+	}
+
 	spin_lock_irqsave(hba->host->host_lock, flags);
 	hba->force_reset = true;
 	ufshcd_schedule_eh_work(hba);
-- 
2.18.0
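
The circular wait described in the commit message can be modeled in a
small userspace C sketch. This is not kernel code: the enum labels, the
two waits_on tables, and in_wait_cycle() are hypothetical names introduced
only for illustration. The edges are read off the three stacks in the dump,
under the assumption that the wlun resume is blocked on the SSU command
whose timeout the SCSI error handler is processing.

```c
#include <stdbool.h>

/* Hypothetical labels for the three blocked contexts in the stack dump. */
enum task { WLUN_RESUME, SCSI_EH, EH_WORK, NR_TASKS, NO_WAIT = -1 };

/*
 * Without the patch (assumed edges):
 *   wlun resume -> SCSI EH     (SSU command timed out, EH owns it)
 *   SCSI EH     -> eh_work     (flush_work(&hba->eh_work))
 *   eh_work     -> wlun resume (runtime PM resume in progress)
 */
static const int waits_before_patch[NR_TASKS] = {
	[WLUN_RESUME] = SCSI_EH,
	[SCSI_EH]     = EH_WORK,
	[EH_WORK]     = WLUN_RESUME,
};

/*
 * With the patch, the host reset handler does link recovery itself
 * when the wlun is RPM_RESUMING, so SCSI EH no longer waits on eh_work.
 */
static const int waits_after_patch[NR_TASKS] = {
	[WLUN_RESUME] = SCSI_EH,
	[SCSI_EH]     = NO_WAIT,
	[EH_WORK]     = WLUN_RESUME,
};

/*
 * Return true if following the wait-for edges from 'start' leads back
 * to 'start'. waits_on[t] is the task that t blocks on, or NO_WAIT.
 */
static bool in_wait_cycle(const int waits_on[NR_TASKS], int start)
{
	int cur = waits_on[start];
	int steps;

	for (steps = 0; steps < NR_TASKS && cur != NO_WAIT; steps++) {
		if (cur == start)
			return true;
		cur = waits_on[cur];
	}
	return false;
}
```

In this model, in_wait_cycle(waits_before_patch, WLUN_RESUME) is true and
in_wait_cycle(waits_after_patch, WLUN_RESUME) is false, which matches the
intent of doing link recovery only instead of scheduling eh work.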