Re: [PATCH v8 04/23] scsi: sd: Differentiate system and runtime start/stop management

On 10/13/23 04:01, Phillip Susi wrote:
> Damien Le Moal <dlemoal () kernel ! org> writes:
> 
>> In theory, yes, that was the intent. In practice, the verify was issued from
>> scsi PM resume context while the actual drive port reset + revalidation is done
>> in libata EH context, triggered from ATA port resume context which itself was
>> not synchronized/ordered with the scsi disk resume. So we ended up with the
>> verify command execution sometimes being attempted with the drive not even
>> revalidated yet, or with the port/link not even active sometimes (depending on
>> timing). So problems all over and deadlocks due to scsi revalidate using the
>> device lock, which PM uses too.
> 
> Yikes.
> 
>> See above. With the switch to async PM ops in scsi in kernel 5.16, things broke
>> badly due to the lack of synchronization that sync PM provided before that.
> 
> Yes, but without async PM ops, the IDENTIFY command that was not
> preceded by a VERIFY worked just fine, right?

Yes. I rechecked the specs regarding this and there is nothing preventing
IDENTIFY from completing with the drive spun down. The only corner case is when
PUIS is enabled, in which case IDENTIFY may return incomplete data. But that is
handled already and that is not something we can get with a system
suspend/resume or runtime suspend/resume.

>> ACS defines that only media access commands can get a drive out of standby mode
>> back into active mode. So an IDENTIFY command would not (normally)
>> spinup a
> 
> Right, it won't CAUSE the drive to spin up, but if it is already in the
> process of spinning up ( due to the reset ), then the drive will finish
> spinning up before answering the IDENTIFY command.  Or do you think that
> some drives may handle the IDENTIFY wrong if they are still in the
> process of spinning up?

From re-reading the specs and testing with all my drives, the port reset spins
up the drives and IDENTIFY completes OK before the spinup completes, so there
is no delay. I CC-ed you a couple of patches that move the VERIFY command
issuing to after revalidation (i.e. after IDENTIFY, READ LOG, etc.). That
works well. I also added a CHECK POWER MODE command to check if sending the
verify is actually needed. And even while the disks are spinning up, I get
power mode 0xFF indicating ACTIVE state, so no need to send the VERIFY command
at all. The end result is that we get to finish the libata EH context doing the
resume well before the disk finishes spinning up (which can take 10+ seconds).
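
As a rough illustration of the check described above, here is a standalone
sketch of the decision. It is not the libata code from the patches: the helper
names are hypothetical, and the only values assumed are 0xFF for the ACTIVE
state (as reported above) and 0x00 for standby.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Power mode value reported by CHECK POWER MODE for the ACTIVE state. */
#define POWER_MODE_ACTIVE  0xFF
/* Power mode value reported for standby (drive spun down). */
#define POWER_MODE_STANDBY 0x00

/* Hypothetical helper: true if the reported power mode is ACTIVE. */
static bool power_mode_is_active(uint8_t power_mode)
{
	return power_mode == POWER_MODE_ACTIVE;
}

/*
 * Hypothetical decision taken once revalidation (IDENTIFY, READ LOG, ...)
 * has completed: a VERIFY is only worth sending if the drive does not
 * already report the ACTIVE power mode.
 */
static bool need_spinup_verify(bool revalidated, uint8_t power_mode)
{
	/* The check is only meaningful after revalidation. */
	assert(revalidated);
	return !power_mode_is_active(power_mode);
}
```

With a drive that reports 0xFF while still spinning up, need_spinup_verify()
returns false, which matches the "no need to send the VERIFY command"
observation.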

With this, the first read or write command following the resume will be delayed
until the drive finishes spinning up. But that is fine given the default 30s
timeout and retries. I do not expect any problems with that.
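
The timing argument can be sketched with some simple arithmetic. This is a
deliberately simplified model, not kernel code: it assumes the command is
issued right at resume, that each attempt waits for the full timeout, and that
a retry is issued immediately after a timeout. The helper name is hypothetical.

```c
#include <assert.h>

/* Default command timeout mentioned above, in seconds. */
#define CMD_TIMEOUT_SECS 30

/*
 * Hypothetical helper: number of attempts needed before a command can
 * complete, given how long the drive takes to spin up. Attempt n covers
 * the time window [(n - 1) * timeout, n * timeout).
 */
static unsigned int attempts_needed(unsigned int spinup_secs,
				    unsigned int timeout_secs)
{
	assert(timeout_secs > 0);
	return spinup_secs / timeout_secs + 1;
}
```

Under this model, a 10 second spinup with the 30 second timeout needs a single
attempt, so the first command is only delayed and never retried.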

-- 
Damien Le Moal
Western Digital Research



