Re: [PATCH 1/1] scsi_dh_alua: properly handling the ALUA transitioning state

Martin Wilck <mwilck@xxxxxxxx> · Fri, 20 May 2022 12:57:23 +0200

Brian, Martin, 

Sorry, I overlooked this patch previously. I have to say I think
it's wrong and shouldn't have been applied. At the very least, I need
a more in-depth explanation.

On Mon, 2022-05-02 at 20:50 -0400, Martin K. Petersen wrote:
> On Mon, 2 May 2022 08:09:17 -0700, Brian Bunker wrote:
> 
> > The handling of the ALUA transitioning state is currently broken.
> > When a target goes into this state, it is expected that the target
> > is allowed to stay in this state for the implicit transition timeout
> > without a path failure.

Can you please show me a quote from the specs on which this expectation
("without a path failure") is based? AFAIK the SCSI specs don't say
anything about device-mapper multipath semantics.

> > The handler has this logic, but it gets
> > skipped currently.
> > 
> > When the target transitions, there is in-flight I/O from the
> > initiator. The first of these responses from the target will be a
> > unit attention letting the initiator know that the ALUA state has
> > changed. The remaining in-flight I/Os, before the initiator finds
> > out that the portal state has changed, will return not ready, ALUA
> > state is transitioning. The portal state will change to
> > SCSI_ACCESS_STATE_TRANSITIONING. This will lead to all new I/O
> > immediately failing the path unexpectedly. The path failure happens
> > in less than a second instead of the expected successes until the
> > transition timer is exceeded.

dm multipath has no concept of "transitioning" state. Path state can be
either active or inactive. As Brian wrote, commands sent to the
transitioning device will return NOT READY, TRANSITIONING, and require
retries on the SCSI layer. If we know this in advance, why should we
continue sending I/O down this semi-broken path? If other, healthy
paths are available, why would it not be the right thing to switch
I/O to them ASAP?

I suppose the problem you want to solve here is a transient situation
in which all paths are transitioning (some up, some down), which would
lead to a failure on the dm level (at least with no_path_retry=0). IMO
this has to be avoided at the firmware level, and if that is
impossible, multipath-tools' (no_path_retry * polling_interval) must be
set to a value that is higher than the time for which this transient
degraded situation would persist.
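
To make the arithmetic concrete, here is a rough sketch in C. The helper
names and the example numbers are made up for illustration and are not
multipath-tools code. It assumes that, once all paths are down,
multipathd keeps queueing for roughly no_path_retry checker intervals of
polling_interval seconds each.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of multipath-tools.
 * Approximate time (seconds) I/O stays queued after the last path
 * failed, assuming one retry per polling interval.
 */
static unsigned int queue_window_secs(unsigned int no_path_retry,
                                      unsigned int polling_interval)
{
    return no_path_retry * polling_interval;
}

/*
 * Hypothetical helper: true if the queueing window is strictly longer
 * than the time during which all paths are expected to be transitioning.
 */
static bool survives_transition(unsigned int no_path_retry,
                                unsigned int polling_interval,
                                unsigned int transition_secs)
{
    return queue_window_secs(no_path_retry, polling_interval) >
           transition_secs;
}
```

With no_path_retry=12 and polling_interval=5, for example, the window is
about 60 seconds, which would cover a 30 second transition, while
no_path_retry=0 leaves no window at all.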

Am I missing something?

The way I see it, this is a problem that affects only storage from one
vendor, and the change would cause suboptimal behavior on most others.
If you really need this, I would suggest a new devinfo flag, e.g.
BLIST_DONT_FAIL_TRANSITIONING.
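
Roughly what I have in mind, as a standalone sketch rather than a patch:
the flag's bit value, the enum and the helper below are placeholders I
made up, not existing kernel definitions. Without the flag, a
transitioning path is failed right away so that dm-multipath can switch
to other paths. With the flag set, I/O keeps being sent to the path and
retried.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit; a real flag would need a free bit in scsi_devinfo.h. */
#define BLIST_DONT_FAIL_TRANSITIONING (UINT64_C(1) << 40)

/* Simplified stand-in for the ALUA access states, for this sketch only. */
enum sketch_alua_state {
    SKETCH_ACTIVE_OPTIMIZED,
    SKETCH_TRANSITIONING,
    SKETCH_UNAVAILABLE,
};

/* Returns true if new I/O on this path should be failed immediately. */
static bool fail_path_immediately(enum sketch_alua_state state,
                                  uint64_t bflags)
{
    switch (state) {
    case SKETCH_TRANSITIONING:
        /* Default: fail fast. Opt-out only for flagged devices. */
        return !(bflags & BLIST_DONT_FAIL_TRANSITIONING);
    case SKETCH_UNAVAILABLE:
        return true;
    default:
        return false;
    }
}
```

This keeps the fail-fast behavior as the default for everybody else and
makes the "stay on the transitioning path" semantics an explicit opt-in
per device.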

Regards,
Martin

> > 
> > [...]
> 
> Applied to 5.18/scsi-fixes, thanks!
> 
> [1/1] scsi_dh_alua: properly handling the ALUA transitioning state
>       https://git.kernel.org/mkp/scsi/c/6056a92ceb2a
>