Re: [PATCH] scsi: sd: add runtime pm to open / release

Martin Kepplinger <martin.kepplinger@xxxxxxx> · Wed, 29 Jul 2020 17:40:12 +0200

On 29.07.20 16:53, James Bottomley wrote:
> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
>> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
>>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
>>>> On 28.07.20 22:02, Alan Stern wrote:
>>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
>>>>> wrote:
>>>>>> Hi Alan,
>>>>>>
>>>>>> Any API cleanup is of course welcome. I just wanted to remind
>>>>>> you of the underlying problem: broken block device runtime
>>>>>> pm. Your initial proposed fix "almost" did it and mounting
>>>>>> works but during file access, it still just looks like a
>>>>>> runtime_resume is missing somewhere.
>>>>>
>>>>> Well, I have tested that proposed fix several times, and on my
>>>>> system it's working perfectly.  When I stop accessing a drive
>>>>> it autosuspends, and when I access it again it gets resumed and
>>>>> works -- as you would expect.
>>>>
>>>> that's weird. when I mount, everything looks good, "sda1". But as
>>>> soon as I cd to the mountpoint and do "ls" (on another SD card
>>>> "ls" works but actual file reading leads to the exact same
>>>> errors), I get:
>>>>
>>>> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>>>> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>>>> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>>>> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
>>>> 00 60 40 00 00 01 00
>>>
>>> This error report comes from the SCSI layer, not the block layer.
>>
>> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
>> CHANGED" so it sounds like it's something we should be
>> ignoring.  Usually this signals a problem, like you changed the
>> medium manually (ejected the CD).  But in this case you can tell us
>> to expect this by setting
>>
>> sdev->expecting_cc_ua
>>
>> And we'll retry.  I think you need to set this on all resumed
>> devices.
> 
> Actually, it's not quite that easy, we filter out this ASC/ASCQ
> combination from the check because we should never ignore medium might
> have changed events on running devices.  We could ignore it if we had a
> flag to say the power has been yanked (perhaps an additional sdev flag
> you set on resume) but we would still miss the case where you really
> had powered off the drive and then changed the media ... if you can
> regard this as the user's problem, then we might have a solution.
> 
> James
>  

Oh, I see what you mean now, thanks for the elaboration.

If I make the following change, everything looks normal and runtime PM
works. I'm not 100% sure whether just setting expecting_cc_ua in resume()
is "correct", but it looks like what you're talking about:

(Note that this is, of course, with the one block layer diff applied
that Alan posted a few emails back.)
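
To make the behaviour change easier to reason about, here is a rough
user-space model of the expecting_cc_ua check for a unit attention.
It is only a sketch, not kernel code: struct model_sdev, the
MODEL_* values and model_unit_attention() are made-up names for
illustration, and "pass up" stands in for whatever the rest of
scsi_check_sense() does when we don't retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model-only outcomes; not the kernel's disposition constants. */
enum model_disposition { MODEL_PASS_UP, MODEL_NEEDS_RETRY };

/* Hypothetical stand-in for the single scsi_device bit used here. */
struct model_sdev {
	bool expecting_cc_ua;
};

/*
 * Model of the expecting_cc_ua check for a unit attention.
 *
 * filter_media_change == true:  behaviour without the scsi_error.c hunk,
 *                               ASC 0x28 / ASCQ 0x00 is never squashed.
 * filter_media_change == false: behaviour with the hunk applied, any
 *                               expected unit attention is retried once.
 */
static enum model_disposition
model_unit_attention(struct model_sdev *sdev, unsigned char asc,
		     unsigned char ascq, bool filter_media_change)
{
	assert(sdev != NULL);

	if (sdev->expecting_cc_ua) {
		bool media_change = (asc == 0x28 && ascq == 0x00);

		if (!filter_media_change || !media_change) {
			/* The flag is one-shot: clear it and retry. */
			sdev->expecting_cc_ua = false;
			return MODEL_NEEDS_RETRY;
		}
	}

	/* Not retried here; left for later handling. */
	return MODEL_PASS_UP;
}
```

With filter_media_change set to true, the 0x28/0x00 unit attention from
my log is passed up even though expecting_cc_ua is set. With it set to
false, the same sense data is retried once and the flag is cleared, so
a second unit attention is passed up again.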


--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
                 * so that we can deal with it there.
                 */
                if (scmd->device->expecting_cc_ua) {
-                       /*
-                        * Because some device does not queue unit
-                        * attentions correctly, we carefully check
-                        * additional sense code and qualifier so as
-                        * not to squash media change unit attention.
-                        */
-                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
-                               scmd->device->expecting_cc_ua = 0;
-                               return NEEDS_RETRY;
-                       }
+                       scmd->device->expecting_cc_ua = 0;
+                       return NEEDS_RETRY;
                }
                /*
                 * we might also expect a cc/ua if another LUN on the target

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..5ad847fed8b9 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
        if (!sdkp)      /* E.g.: runtime resume at the start of sd_probe() */
                return 0;

+       sdkp->device->expecting_cc_ua = 1;
+
        if (!sdkp->device->manage_start_stop)
                return 0;
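
For comparison, the narrower variant James describes (keep the
0x28/0x00 filter, but let an extra flag set on resume override it)
could look roughly like the user-space sketch below. This is an
assumption-laden illustration only: resumed_from_pm is a hypothetical
field that does not exist in struct scsi_device, and struct
sketch_sdev, sketch_resume() and sketch_unit_attention() are made-up
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch-only outcomes; not the kernel's disposition constants. */
enum sketch_disposition { SKETCH_PASS_UP, SKETCH_NEEDS_RETRY };

struct sketch_sdev {
	bool expecting_cc_ua;
	bool resumed_from_pm;	/* hypothetical flag, set on resume */
};

/* What a resume path would do in this sketch. */
static void sketch_resume(struct sketch_sdev *sdev)
{
	assert(sdev != NULL);
	sdev->expecting_cc_ua = true;
	sdev->resumed_from_pm = true;
}

/*
 * Keep the media change filter, but allow ASC 0x28 / ASCQ 0x00 to be
 * retried once if the device has just been resumed.
 */
static enum sketch_disposition
sketch_unit_attention(struct sketch_sdev *sdev, unsigned char asc,
		      unsigned char ascq)
{
	assert(sdev != NULL);

	if (sdev->expecting_cc_ua) {
		bool media_change = (asc == 0x28 && ascq == 0x00);

		if (!media_change || sdev->resumed_from_pm) {
			/* Both flags are one-shot. */
			sdev->expecting_cc_ua = false;
			sdev->resumed_from_pm = false;
			return SKETCH_NEEDS_RETRY;
		}
	}

	return SKETCH_PASS_UP;
}
```

On a running device that was not resumed, a 0x28/0x00 unit attention
is still passed up, so a real media change is not hidden. The
trade-off James mentions remains: media swapped while the drive was
powered off would be retried once after resume and not reported.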