On 8/1/23 01:13, Hannes Reinecke wrote:
> On 7/31/23 02:39, Damien Le Moal wrote:
>> During system resume, ata_port_pm_resume() triggers ata EH to
>> 1) Resume the controller
>> 2) Reset and rescan the ports
>> 3) Revalidate devices
>> This EH execution is started asynchronously from ata_port_pm_resume(),
>> which means that when sd_resume() is executed, none or only part of the
>> above processing may have been executed. However, sd_resume() issues a
>> START STOP UNIT to wake up the drive from sleep mode. This command is
>> translated to ATA with ata_scsi_start_stop_xlat() and issued to the
>> device. However, depending on the state of execution of the EH process
>> and revalidation triggered by ata_port_pm_resume(), two things may
>> happen:
>> 1) The START STOP UNIT fails if it is received before the controller has
>>    been reenabled at the beginning of the EH execution. This is visible
>>    with error messages like:
>>
>> ata10.00: device reported invalid CHS sector 0
>> sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current]
>> sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command
>> sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5
>> sd 9:0:0:0: PM: failed to resume async: error -5
>>
>> 2) The START STOP UNIT command is received while the EH process is
>>    on-going, which means that it is stopped and must wait for its
>>    completion, at which point the command is rather useless as the drive
>>    is already fully spun up. This case also results in a significant
>>    delay in sd_resume() which is observable by users as the entire
>>    system resume completion is delayed.
>>
>> Given that ATA devices will be woken up by libata activity on resume,
>> sd_resume() has no need to issue a START STOP UNIT command, which solves
>> the above mentioned problems. Do not issue this command by introducing
>> the new scsi_device flag no_start_on_resume and setting this flag to 1
>> in ata_scsi_dev_config(). sd_resume() is modified to issue a START STOP
>> UNIT command only if this flag is not set.
>>
> Q: As libata starts up the drive internally via reset/revalidate, why do
> we have to send START STOP UNIT for shutdown?

To spin down the drive and put it to sleep.

> Wouldn't it be better to disable START STOP UNIT completely for libata,
> and let everything be handled via ATA command (IDLE IMMEDIATE, SLEEP) ?
> Hmm?

I am not sure this buys us much. I could try, but that is too many
changes for a bug fix in this cycle. I think we need to consider it in
the context of reworking ata suspend/resume so that ata device and scsi
device suspend/resume are correctly synchronized through a direct device
link, which we do not have right now.

> Otherwise a really good idea. We've fallen into this trap several times
> ourselves.
> Incidentally, USB could benefit from the same mechanism ...

I could look into it, but these bug reports, if any, do not come to me :)

>
> Cheers,
>
> Hannes

-- 
Damien Le Moal
Western Digital Research
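
For readers following the thread, the flag logic described in the quoted
patch can be sketched as a small user-space model. This is only an
illustration under stated assumptions, not the kernel code: struct
sdev_model, issue_start_stop(), start_stop_count and the *_model()
function names are hypothetical stand-ins. Only the flag name
no_start_on_resume and the roles of ata_scsi_dev_config() and
sd_resume() come from the patch description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct scsi_device. */
struct sdev_model {
	bool no_start_on_resume;	/* flag named in the patch description */
};

/* Counts START STOP UNIT commands "issued" so the behavior is observable. */
int start_stop_count;

/* Hypothetical helper: pretend to send START STOP UNIT to the device. */
int issue_start_stop(struct sdev_model *sdev, bool start)
{
	assert(sdev != NULL);
	(void)start;		/* start=true spins up, start=false spins down */
	start_stop_count++;
	return 0;
}

/* Models ata_scsi_dev_config(): libata-backed devices set the flag. */
void ata_scsi_dev_config_model(struct sdev_model *sdev)
{
	assert(sdev != NULL);
	sdev->no_start_on_resume = true;
}

/* Models sd_resume(): skip START STOP UNIT when the flag is set. */
int sd_resume_model(struct sdev_model *sdev)
{
	assert(sdev != NULL);
	if (sdev->no_start_on_resume)
		return 0;	/* libata EH wakes the drive up on its own */
	return issue_start_stop(sdev, true);
}

/* Models the suspend side: the spin-down command is still issued. */
int sd_suspend_model(struct sdev_model *sdev)
{
	assert(sdev != NULL);
	return issue_start_stop(sdev, false);
}
```

In this model, a device configured through ata_scsi_dev_config_model()
resumes without incrementing start_stop_count, a device without the flag
still gets the command on resume, and suspend issues the spin-down
command in both cases.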