On 2023/09/21 14:36, Bart Van Assche wrote:
> On 9/20/23 06:54, Damien Le Moal wrote:
>> If an error occurs when resuming a host adapter before the devices
>> attached to the adapter are resumed, the adapter low level driver may
>> remove the scsi host, resulting in a call to sd_remove() for the
>> disks of the host. This in turn results in a call to sd_shutdown() which
>> will issue a synchronize cache command and a start stop unit command to
>> spindown the disk. sd_shutdown() issues the commands only if the device
>> is not already suspended but does not check the power state for
>> system-wide suspend/resume. That is, the commands may be issued with the
>> device in a suspended state, which causes PM resume to hang, forcing a
>> reset of the machine to recover.
>>
>> Fix this by not calling sd_shutdown() in sd_remove() if the device
>> is not running.
>
> Hi Damien,
>
> I'd like to look into an alternative fix (after this patch series went
> in) but I couldn't identify the call chain in the ATA resume code that
> results in removal of the SCSI host. Can you please show me the call
> chain that results in SCSI host removal if resuming fails?

See the pm80xx driver, for which I recently fixed a resume issue. That is
how I found this problem with device removal: resuming the pm80xx HBA was
failing, and the driver then called scsi_remove_host() to drop the ports.
That in turn led to an attempt to remove sd devices that were still
suspended.

> Thanks,
>
> Bart.

-- 
Damien Le Moal
Western Digital Research
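
For illustration only, the guard described in the quoted patch can be
modeled in user space as below. This is a minimal sketch and not the
kernel code: the model_* names, the two-state enum and the "hung" flag
are all hypothetical stand-ins, and the real sd_remove()/sd_shutdown()
take different arguments and check different fields.

```c
#include <assert.h>   /* available for the checks in the usage note */
#include <stdbool.h>

/* Hypothetical device states; NOT the kernel's enum scsi_device_state. */
enum model_state { MODEL_RUNNING, MODEL_SUSPENDED };

struct model_disk {
	enum model_state state;
	int commands_issued;	/* commands sent to the disk */
	bool hung;		/* set if a command reached a suspended disk */
};

/*
 * Stand-in for sd_shutdown(): issues a synchronize cache command and a
 * start stop unit command. Sending them to a suspended disk is modeled
 * as a hang.
 */
void model_shutdown(struct model_disk *d)
{
	if (d->state != MODEL_RUNNING) {
		d->hung = true;
		return;
	}
	d->commands_issued += 2;
}

/*
 * Stand-in for sd_remove() with the guard from the patch description:
 * skip the shutdown path entirely when the device is not running.
 */
void model_remove(struct model_disk *d)
{
	if (d->state == MODEL_RUNNING)
		model_shutdown(d);
}
```

In this model, calling model_remove() on a disk in MODEL_SUSPENDED state
issues no commands and never sets "hung", while a disk in MODEL_RUNNING
state still gets both commands.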