On 6/15/23 03:04, Bart Van Assche wrote: > On 6/14/23 07:26, Alan Stern wrote: >> On Wed, Jun 14, 2023 at 04:35:50PM +0900, Damien Le Moal wrote: >>> Or... Why the heck scsi_rescan_device() is calling device_lock() ? This >>> is the only place in scsi code I can see that takes this lock. I suspect >>> this is to serialize either rescans, or serialize with resume, or both. >>> For serializing rescans, we can use another lock. For serializing with >>> PM, we should wait for PM transitions... >>> Something is not right here. >> >> Here's what commit e27829dc92e5 ("scsi: serialize ->rescan against >> ->remove", written by Christoph Hellwig) says: >> >> Lock the device embedded in the scsi_device to protect against >> concurrent calls to ->remove. >> >> That's the commit which added the device_lock() call. > > Even if scsi_rescan_device() would use another mechanism for > serialization against sd_remove() and sr_remove(), we still need to > solve the issue that the ATA code calls scsi_rescan_device() before > resuming has finished. scsi_rescan_device() issues I/O. Issuing I/O to a > device is not allowed before that device has been resumed. I am not convinced of that: scsi suspend quiecse the queue, thus preventing IOs from the block layer, but not internale scsi ml commands, which is what scsi_rescan_device() issues. In any case, I am thinking that best (and quickest) fix for this issue for now is to have libata define a device link to make the scsi device a "parent" of the ata device (which is the ata link as of now). This way, PM operation ordering will ensure that the scsi device resume will be done before the ata device. What I really do not like about this though is that the suspend side would be done in the reverse order: ata first and then scsi, but we really want the reverse here to ensure that the request queue is quiesced before we suspend ata. That said, there is no such synchronization right now and so this is probably happening already without raising issues apparently. So ideally: 1) Make the ata device the parent of the scsi device using a device link 2) For suspend, the scsi device suspend will be done first, followed by the ata device, which is what we want. 3) For resume, ata device will be first, followed by scsi device. The call to scsi_rescan_device() from libata being in a work task, asynchronous from the ata resume context, we need to synchronize that work to wait for the scsi device resume to complete. (but do we really given that we are going to issue internal commands only ?) Alan, Rafael, For the synchronization of step (3), if I understand the pm code correctly, using device_pm_wait_for_dev() would work only if async resume is on. This would be ineffective for the sync case. How can we best deal with this ? -- Damien Le Moal Western Digital Research