On Fri, Jul 02, 2021 at 06:03:20PM +0100, John Garry wrote: > Hi guys, > > We're experiencing a deadlock between trying to remove a SATA device and > doing a rescan in scsi_rescan_device(). > > I'm just looking for a suggestion on how to solve. > > The background is that the host (hisi sas v3 hw) uses SAS SCSI transport and > supports RPM. In the testcase, the host and disks are put to suspend. Then > we run fio on the disk to make them active and then immediately hard reset > the disk link, which causes the disk to be disconnected (please don't ask > why ...). > > We find that there is a conflict between the rescan and the device removal > code, resulting in a deadlock: > The rescan holds the sdev_gendev.device lock in scsi_rescan_device(), while > the removal code in __scsi_remove_device() wants to grab it. > > However the rescan will not release (the lock) until the blk_queue_enter() > succeeds, above. That can happen 2x ways: > > - the queue is dying, which would not happen until after the device_del() in > __scsi_remove_device(), so not going to happen > > - q->pm_only falls to 0. This would be when scsi_runtime_resume() -> > sdev_runtime_resume() -> blk_post_runtime_resume(err = 0) -> > blk_set_runtime_active() is called. However, I find that the err argument > for me is -5, which comes from sdev_runtime_resume() -> pm->runtime_resume > (=sd_resume()), which fails. That sd_resume() -> sd_start_stop_device() > fails as the disk is not attached. So we go into error state: > > $:more /sys/devices/pci0000:b4/0000:b4:04.0/host3/port-3:0/end_device-3:0/target3:0:0/3:0:0:0/power/runtime_status > error > > Removing commit e27829dc92e5 ("scsi: serialize ->rescan against ->remove") > solves this issue for me, but that is there for a reason. > > Any suggestion on how to fix this deadlock? This is indeed a tricky question. It seems like we should allow a runtime resume to succeed if the only reason it failed was that the device has been removed. More generally, perhaps we should always consider that a runtime resume succeeds. Any remaining problems will be dealt with by the device's driver and subsystem once the device is marked as runtime-active again. Suppose you try changing blk_post_runtime_resume() so that it always calls blk_set_runtime_active() regardless of the value of err. Does that fix the problem? And more importantly, will it cause any other problems...? Alan Stern