On Thu, Jun 15, 2023 at 05:33:26PM +0900, Damien Le Moal wrote: > When an ATA port is resumed from sleep, the port is reset and a power > management request issued to libata EH to reset the port and rescanning > the device(s) attached to the port. Device rescanning is done by > scheduling an ata_scsi_dev_rescan() work, which will execute > scsi_rescan_device(). > > However, scsi_rescan_device() takes the generic device lock, which is > also taken by dpm_resume() when the SCSI device is resumed as well. If > a device rescan execution starts before the completion of the SCSI > device resume, the rcu locking used to refresh the cached VPD pages of > the device, combined with the generic device locking from > scsi_rescan_device() and from dpm_resume() can cause a deadlock. > > Avoid this situation by changing struct ata_port scsi_rescan_task to be > a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is > modified to check if the SCSI device associated with the ATA device that > must be rescanned is not suspended. If the SCSI device is still > suspended, ata_scsi_dev_rescan() returns early and reschedule itself for > execution after an arbitrary delay of 5ms. I don't understand the nature of the relationship between the ATA port and the corresponding SCSI device. Maybe you could explain it more fully, if you have time. But in any case, this approach seems like a layering violation. Why not instead call a SCSI utility routine to set a "needs_rescan" flag in the scsi_device structure? Then scsi_device_resume() could automatically call scsi_rescan_device() -- or rather an internal version that assumes the device lock is already held -- if the flag is set. Or it could queue a non-delayed work routine to do this. (Is it important to have the rescan finish before userspace starts up and tries to access the ATA device again?) That, combined with a guaranteed order of resuming, would do what you want, right? Alan Stern