Re: [PATCH] scsi: dm-mpath: do not fail paths when the target returns ALUA state transition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Martin and Hannes,

Any resolution to this issue?

Thanks,
Brian

On Fri, Jul 16, 2021 at 11:39 AM Brian Bunker <brian@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Jul 16, 2021 at 1:25 AM Martin Wilck <mwilck@xxxxxxxx> wrote:
> >
> > Hannes,
> >
> > On Fr, 2021-07-16 at 08:27 +0200, Hannes Reinecke wrote:
> > > On 7/15/21 6:57 PM, Brian Bunker wrote:
> > > > When paths return an ALUA state transition, do not fail those paths.
> > > > The expectation is that the transition is short lived until the new
> > > > ALUA state is entered. There might not be other paths in an online
> > > > state to serve the request which can lead to an unexpected I/O error
> > > > on the multipath device.
> > > >
> > > > Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx>
> > > > Acked-by: Krishna Kant <krishna.kant@xxxxxxxxxxxxxxx>
> > > > Acked-by: Seamus Connor <sconnor@xxxxxxxxxxxxxxx>
> > > > --
> > > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > > > index bced42f082b0..28948cc481f9 100644
> > > > --- a/drivers/md/dm-mpath.c
> > > > +++ b/drivers/md/dm-mpath.c
> > > > @@ -1652,12 +1652,12 @@ static int multipath_end_io(struct dm_target
> > > > *ti, struct request *clone,
> > > >          if (error && blk_path_error(error)) {
> > > >                  struct multipath *m = ti->private;
> > > >
> > > > -               if (error == BLK_STS_RESOURCE)
> > > > +               if (error == BLK_STS_RESOURCE || error ==
> > > > BLK_STS_AGAIN)
> > > >                          r = DM_ENDIO_DELAY_REQUEUE;
> > > >                  else
> > > >                          r = DM_ENDIO_REQUEUE;
> > > >
> > > > -               if (pgpath)
> > > > +               if (pgpath && (error != BLK_STS_AGAIN))
> > > >                          fail_path(pgpath);
> > > >
> > > >                  if (!atomic_read(&m->nr_valid_paths) &&
> > > > --
> > >
> > > Sorry, but this will lead to regressions during failover for arrays
> > > taking longer time (some take up to 30 minutes for a complete
> > > failover).
> > > And for those it's absolutely crucial to _not_ retry I/O on the paths
> > > in
> > > transitioning.
> >
> > This won't happen.
> >
> > As argued in https://marc.info/?l=linux-scsi&m=162625194203635&w=2,
> > your previous patches avoid the request being requeued on the SCSI
> > queue, even with Brian's patch on top. IMO that means that the deadlock
> > situation analyzed earlier can't occur. If you disagree, please explain
> > in detail.
> >
> > By not failing the path, the request can be requeued on the dm level,
> > and dm can decide to try the transitioning path again. But the request
> > wouldn't be added to the SCSI queue, because alua_prep_fn() would
> > prevent that.
> >
> > So, in the worst case, dm would retry queueing the request on the
> > transitioning path over and over again. By adding a delay, we avoid
> > busy-looping. This has basically the same effect as queueing on the dm
> > layer in the first place: the request stays queued on the dm level most
> > of the time. Except for the fact that the queueing would stop earlier:
> > as soon as the kernel notices that the path is not transitioning any
> > more. By not failing the dm paths, we don't depend on user space to
> > reinstate them, which is the right thing to do for a transitioning
> > state IMO.
> >
> > > And you already admitted that 'queue_if_no_path' would resolve this
> > > problem, so why not update the device configuration in multipath-
> > > tools
> > > to have the correct setting for your array?
>
> The reason I don't like queue_if_no_path for a solution is that there
> are times we do want to tail all of the paths as quickly as possible
> (e.g. a cluster sharing a resource). If we add some non zero value to
> no_path_retry we would be forcing that configuration to unnecessarily
> wait, and those customers will see this delay as a regression. This is
> where a distinction between an ALUA state of standby or unavailable
> vs. a transition ALUA state is attractive.
>
> >
> > I think Brian is right that setting transitioning paths to failed state
> > in dm is highly questionable.
> >
> > So far, in dm-multipath, we haven't set paths to failed state because
> > of ALUA state transitions. We've been mapping ALUA states to priorities
> > instead. We don't even fail paths in ALUA "unavailable" state, so why
> > do it for "transitioning"?
> >
> > Thanks
> > Martin
> >
> >
>
> Thanks,
> Brian
>
> --
> Brian Bunker
> PURE Storage, Inc.
> brian@xxxxxxxxxxxxxxx



-- 
Brian Bunker
PURE Storage, Inc.
brian@xxxxxxxxxxxxxxx



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux