Re: [PATCH] ata: libata-scsi: Avoid deadlock on rescan after device resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 15, 2023 at 05:33:26PM +0900, Damien Le Moal wrote:
> When an ATA port is resumed from sleep, the port is reset and a power
> management request issued to libata EH to reset the port and rescanning
> the device(s) attached to the port. Device rescanning is done by
> scheduling an ata_scsi_dev_rescan() work, which will execute
> scsi_rescan_device().
> 
> However, scsi_rescan_device() takes the generic device lock, which is
> also taken by dpm_resume() when the SCSI device is resumed as well. If
> a device rescan execution starts before the completion of the SCSI
> device resume, the rcu locking used to refresh the cached VPD pages of
> the device, combined with the generic device locking from
> scsi_rescan_device() and from dpm_resume() can cause a deadlock.
> 
> Avoid this situation by changing struct ata_port scsi_rescan_task to be
> a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is
> modified to check if the SCSI device associated with the ATA device that
> must be rescanned is not suspended. If the SCSI device is still
> suspended, ata_scsi_dev_rescan() returns early and reschedule itself for
> execution after an arbitrary delay of 5ms.

I don't understand the nature of the relationship between the ATA port
and the corresponding SCSI device.  Maybe you could explain it more
fully, if you have time.

But in any case, this approach seems like a layering violation.  Why not 
instead call a SCSI utility routine to set a "needs_rescan" flag in the 
scsi_device structure?  Then scsi_device_resume() could automatically 
call scsi_rescan_device() -- or rather an internal version that assumes 
the device lock is already held -- if the flag is set.  Or it could 
queue a non-delayed work routine to do this.  (Is it important to have 
the rescan finish before userspace starts up and tries to access the ATA 
device again?)

That, combined with a guaranteed order of resuming, would do what you 
want, right?

Alan Stern



[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux