Re: [PATCH] ata,scsi: do not issue START STOP UNIT on resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/31/23 12:48, TW wrote:
> Tested-by: Tanner Watkins

A tested-by tag must have the form:

Tested-by: Tanner Watkins <dalzot@xxxxxxxxx>

Also, please do not top-post. That is, reply with your text (this tag), below
the original message (which you can trim).

Thanks.

> 
> On 7/30/23 18:39, Damien Le Moal wrote:
>> During system resume, ata_port_pm_resume() triggers ata EH to
>> 1) Resume the controller
>> 2) Reset and rescan the ports
>> 3) Revalidate devices
>> This EH execution is started asynchronously from ata_port_pm_resume(),
>> which means that when sd_resume() is executed, none or only part of the
>> above processing may have been executed. However, sd_resume() issues a
>> START STOP UNIT to wake up the drive from sleep mode. This command is
>> translated to ATA with ata_scsi_start_stop_xlat() and issued to the
>> device. However, depending on the state of execution of the EH process
>> and revalidation triggerred by ata_port_pm_resume(), two things may
>> happen:
>> 1) The START STOP UNIT fails if it is received before the controller has
>>     been reenabled at the beginning of the EH execution. This is visible
>>     with error messages like:
>>
>> ata10.00: device reported invalid CHS sector 0
>> sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current]
>> sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command
>> sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5
>> sd 9:0:0:0: PM: failed to resume async: error -5
>>
>> 2) The START STOP UNIT command is received while the EH process is
>>     on-going, which mean that it is stopped and must wait for its
>>     completion, at which point the command is rather useless as the drive
>>     is already fully spun up already. This case results also in a
>>     significant delay in sd_resume() which is observable by users as
>>     the entire system resume completion is delayed.
>>
>> Given that ATA devices will be woken up by libata activity on resume,
>> sd_resume() has no need to issue a START STOP UNIT command, which solves
>> the above mentioned problems. Do not issue this command by introducing
>> the new scsi_device flag no_start_on_resume and setting this flag to 1
>> in ata_scsi_dev_config(). sd_resume() is modified to issue a START STOP
>> UNIT command only if this flag is not set.
>>
>> Reported-by: Paul Ausbeck <paula@xxxxxxxxxxxx>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215880
>> Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
>> Signed-off-by: Damien Le Moal <dlemoal@xxxxxxxxxx>
>> ---
>>   drivers/ata/libata-scsi.c  | 7 +++++++
>>   drivers/scsi/sd.c          | 9 ++++++---
>>   include/scsi/scsi_device.h | 1 +
>>   3 files changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index 370d18aca71e..c6ece32de8e3 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -1100,7 +1100,14 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
>>   		}
>>   	} else {
>>   		sdev->sector_size = ata_id_logical_sector_size(dev->id);
>> +		/*
>> +		 * Stop the drive on suspend but do not issue START STOP UNIT
>> +		 * on resume as this is not necessary and may fail: the device
>> +		 * will be woken up by ata_port_pm_resume() with a port reset
>> +		 * and device revalidation.
>> +		 */
>>   		sdev->manage_start_stop = 1;
>> +		sdev->no_start_on_resume = 1;
>>   	}
>>   
>>   	/*
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 68b12afa0721..3c668cfb146d 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3876,7 +3876,7 @@ static int sd_suspend_runtime(struct device *dev)
>>   static int sd_resume(struct device *dev)
>>   {
>>   	struct scsi_disk *sdkp = dev_get_drvdata(dev);
>> -	int ret;
>> +	int ret = 0;
>>   
>>   	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
>>   		return 0;
>> @@ -3884,8 +3884,11 @@ static int sd_resume(struct device *dev)
>>   	if (!sdkp->device->manage_start_stop)
>>   		return 0;
>>   
>> -	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
>> -	ret = sd_start_stop_device(sdkp, 1);
>> +	if (!sdkp->device->no_start_on_resume) {
>> +		sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
>> +		ret = sd_start_stop_device(sdkp, 1);
>> +	}
>> +
>>   	if (!ret)
>>   		opal_unlock_from_suspend(sdkp->opal_dev);
>>   	return ret;
>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>> index 75b2235b99e2..b9230b6add04 100644
>> --- a/include/scsi/scsi_device.h
>> +++ b/include/scsi/scsi_device.h
>> @@ -194,6 +194,7 @@ struct scsi_device {
>>   	unsigned no_start_on_add:1;	/* do not issue start on add */
>>   	unsigned allow_restart:1; /* issue START_UNIT in error handler */
>>   	unsigned manage_start_stop:1;	/* Let HLD (sd) manage start/stop */
>> +	unsigned no_start_on_resume:1; /* Do not issue START_STOP_UNIT on resume */
>>   	unsigned start_stop_pwr_cond:1;	/* Set power cond. in START_STOP_UNIT */
>>   	unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
>>   	unsigned select_no_atn:1;

-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux