Re: [PATCH 3/4] scsi: Internally retry scsi_execute commands

Martin Wilck <mwilck@xxxxxxxx> · Thu, 11 Aug 2022 11:56:20 +0200

On Wed, 2022-08-10 at 12:06 -0500, Mike Christie wrote:
> On 8/10/22 5:46 AM, Martin Wilck wrote:
> > Hi Mike,
> > 
> > On Tue, 2022-08-09 at 22:41 -0500, Mike Christie wrote:
> > > In several places like LU setup and pr_ops we will hit the
> > > noretry
> > > code
> > > path because we do not retry any passthrough commands that hit
> > > device
> > > errors even though scsi-ml thinks the command is retryable.
> > > 
> > > This has us only fast fail commands that hit device errors that
> > > have
> > > been
> > > submitted through the block layer directly and not via
> > > scsi_execute.
> > > This
> > > allows SG IO and other users to continue to get fast failures and
> > > all
> > > device errors returned to them, and scsi_execute users will get
> > > their
> > > retries they had requested.
> > > 
> > > Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
> > 
> > Thanks a lot. I like the general approach, but I am concerned that
> > by
> > treating every command sent by scsi_execute() or scsi_execute_req()
> > as
> > a retryable command, we may break some callers, or at least modify
> > their semantics in unexpected ways. For example, spi_execute(),
> > scsi_probe_lun(), scsi_report_lun_scan() currently retry only on
> > UA.
> > With this change, these commands will be retried in additional
> > cases,
> > without the caller noticing (see 949bf797595fc ("[SCSI] fix command
> > retries in spi_transport class"). It isn't obvious to me that this
> > is
> > correct in all affected cases. 
> Let's make sure we are on the same page, because I might agree with
> you.
> But for possible solutions we need to agree what this patch actually
> changes.
> 
> We currently have 3 places we get retries from:
> 
> 1. scsi_decide_disposition - For passthrough commands the patch only
> changes
> the behavior when scsi_decide_disposition gets NEED_RETRY, retries <
> allowed,
> and REQ_FAILFAST_DEV is not set.
> 
> It's really specific and not as general as I think you thought it
> was.

I was aware of it, but I confused matters for some cases :-/

> 2. scsi_io_completion - Passthrough commands are never retried here.
> 
> 3. scsi_execute users driving retries.

There's also the error handler retrying commands e.g. after a
successful abort. This is the DID_TIME_OUT case, actually -
scsi_decide_disposition() never calls scsi_noretry_cmd() for
DID_TIME_OUT.

> For your examples above:
>  
> - The scan/probe functions ask for 3 retries and so with patch1 we
> will now
> get 3 x 3 retries for errors that hit #1.
> 
> So I agree this is really wrong for DID_TIME_OUT.
> 
> - There is no behavior change for spi because it uses
> REQ_FAILFAST_DEV.

Overlooked that, thanks.

> > 
> > Note that the retry logic of the mid level may change depending on
> > the
> > installed device handler. For example, ALUA devices will endlessly
> > retry UA with ASC=29, which some callers explicitly look for.
> 
> There is no behavior change with my patch for ASC=29 case, because it
> uses ADD_TO_MLQUEUE and we don't run scsi_noretry_cmd for that error.

Right, thanks for pointing that out. I saw it but didn't realize what
it meant.

> It could change how 0x04/0x0a is handled because it uses NEEDS_RETRY.
> However, scsi_dh_alua uses REQ_FAILFAST_DEV so we do not retry in
> scsi_noretry_cmd like before.

Not quite following you here - alua_check_sense() is called for any
command, not just those submitted from the ALUA code.
> 

> > I believe we need to review all callers that have their own retry
> > loop
> > and /or error handling logic. This includes sd_spinup_disk(),
> > sd_sync_cache(), scsi_test_unit_ready(). SCSI device handlers treat
> > some sense keys in special ways, or may retry commands on another
> > member of a port group (see alua_rtpg()).
> 
> There is no change in behavior for the alua one but agree with the
> general
> comment.
> 
> > 
> > DID_TIME_OUT is a general concern - no current caller of
> > scsi_execute()
> > expects timed-out commands to be retried, and doing so has the
> > potential of slowing down operations a lot. I am aware that my
> > recent
> > patch changed exactly this for scsi_probe_lun(), but doing the same
> > thing for every scsi_execute() invocation sounds at least somewhat
> > dangerous.
> 
> Agree this patch is wrong:
> - With patch1 that fixes scsi_cmnd->allowed we can end up with N and
> M
> DID_TIME_OUT retries and that can get crazy. So agree with that.
> 
> - For the general idea of do we always want to retry DID_TIME_OUT, I
> can
> I also see your point.
> 
> - After reading your mail and thinking about patch 4, I was thinking
> that
> this is wrong for patch 4 as well. For the pr_ops case we want the
> opposite
> of what you were mentioning in here. I actually want scsi-ml to retry
> all UAs for me or allow me to retry all UAs and not just handle the
> specific
> PGR related ones.
> 
> I'm not sure where to go from here:
> 
> 1. Just have the callers drive extra retries like we do now.
> 
> I guess I've come around and are ok with this.

Me too, but I'm not sure about Christoph.

> 
> With patch1, scsi_cmnd->allowed and scsi_execute works for me, so my
> previous comment about scsi_probe_lun needing to retry that case does
> not apply since scsi-ml will do it for me.
> 
> I do think we need to retry your case in other places though.
> 
> 2. Instead of trying to make it general for all scsi_execute_users,
> we can
> add SCMD bits for specific cases like DID_TIME_OUT or a SCMD bit that
> tells
> scsi_noretry_cmd to not always fail passthrough commands just because
> they
> are passthrough. It would work the opposite of the FASTFAIL bits
> where instead
> of failing fast, we retry.
> 
> I think because the cases scsi_noretry_cmd is used for are really
> specific cases
> (scsi_decide_disposition sees NEEDS_RETRY, retries < allowed, and
> REQ_FAILFAST_DEV
> is not set) that might not be very useful. 

I don't think it's _that_ speficic. (retries < allowed) is the default
case, at least for the first failure. REQ_FAILFAST_DEV has very few
users except for the device handlers, and NEEDS_RETRY is a rather
frequently used disposition.

> I'm not sure we want to add a bunch of
> cases specific to scsi_execute callers to scsi_check_sense? I don't
> think we do.

If we don't want to fall back to 1. above, I think we can go a long way
with two additional flags for scsi_execute() callers:

 a) indicate whether DID_TIME_OUT should be retried by the error
handler (default: no),
 b) indicate whether UA (and NOT READY?) should be retried on the ML
(default: current behavior)

The rationale for b) is that (IIRC from yesterday's code inspection all
callers that check sense codes only define special cases for UA, or
some subcases of it.

Maybe we need also

 c) indicate whether the (don't) retry logic for SG_IO should be used.

> To be clear, I mean for this approach to be generally useful we would
> have to
> add the cases scsi_execute callers are retrying in scsi_check_sense.
> We could
> then remove the scsi_execute caller driven retries and only have
> scsi_decide_disposition
> retry.
> 
> It sounded really nice at first, but to do that I would have a bunch
> of cases
> that might be specific to pr_ops, alua or scanning, etc. Then I also
> had cases where
> user1 does not want to retry but user2 did, so I have to add more
> SCMD bits. So, in
> the end it will get messier than you initially think.
> 

I agree. I thought about passing a lookup table of sense keys (not) to
be retried to the ML, but it feels over-engineered to me. 

Thanks
Martin