Re: [PATCH v8 04/23] scsi: sd: Differentiate system and runtime start/stop management

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/10/23 22:09, Phillip Susi wrote:
> Damien Le Moal <dlemoal@xxxxxxxxxx> writes:
> 
>> system suspend/resume operations, the ATA port used to connect the
>> device will also be suspended and resumed, with the resume operation
>> requiring re-validating the device link and the device itself. In this
>> case, issuing a VERIFY command to spinup the disk must be done before
>> starting to revalidate the device, when the ata port is being resumed.
>> In such case, we must not allow the SCSI disk driver to issue START STOP
>> UNIT commands.
> 
> Why must a VERIFY be issued to spinup the disk before revalidating?

We can most likely move that VERIFY to after revalidation. That should shorten
delays on first access after resume as many drive do actually spinup with the
reset done before revalidating (but note that the specs say that even a reset
shall not take a drive out of standby mode, without specifying the reset type,
so this is rather loosely defined).

> Before these patches, by default, manage_start_stop was on, and so sd
> would cause a VERIFY in the system resume path.  That resume however (
> sd and its issuing START UNIT ), would have happened AFTER the link was
> resumed and the ATA device was revalidated, woudldn't it?  So at that

In theory, yes, that was the intent. In practice, the verify was issued from
scsi PM resume context while the actual drive port reset + revalidation is done
in libata EH context, triggered from ATA port resume context which itself was
not synchronized/ordered with the scsi disk resume. So we ended up with the
verify command execution sometimes being attempted with the drive not even
revalidated yet, or with the port/link not even active sometimes (depending on
timing). So problems all over and deadlocks due to scsi revalidate using the
device lock, which PM use too.

> point, the drive would already be spinning.  And if manage_start_stop
> was disabled, then there would be no VERIFY at all, and this did not
> seem to cause a problem before.

See above. With the switch to async PM ops in scsi in kernel 5.16, things broke
badly due to the lack of synchronization that sync PM provided before that.

> 
> If this VERIFY were skipped, the next thing that would happen is for
> ata_dev_revalidate() to issue IDENTIFY, which would wait for the drive
> to spin up before returning wouldn't it? ( unless the drive has PuiS
> enabled ).

ACS defines that only media access commands can get a drive out of standby mode
back into active mode. So an IDENTIFY command would not (normally) spinup a
drive. Nor would READ LOG. Normally, IDENTIFY, READ LOG etc done in revalidate
can be processed with the drive spun down (*but* I have seen drives that react
badly to some of these commands when spun down). Hence why I put it first,
before revalidate. Now that all the synchronization is in place and libata EH
does its own manage_start_stop for system suspend/resume, I will see if moving
the verify to the end of revalidate works.


-- 
Damien Le Moal
Western Digital Research




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux