Re: [PATCH 1/7] libmultipath: Add max_retries config option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 08, 2023 at 03:36:14PM +0000, Martin Wilck wrote:
> On Thu, 2023-11-02 at 18:15 -0400, Benjamin Marzinski wrote:
> > This option lets multipath set a scsi disk's max_retries sysfs value.
> > Setting this can be helpful for cases where the path checker
> > succeeds,
> > but IO commands hang and timeout. By default, the SCSI layer will
> > retry
> > IOs 5 times. Reducing this value will allow multipath to retry the IO
> > down another path sooner.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
> 
> 2 nitpicks below. Please explain to me again why we recommend to
> activate shaky paths detection with this. What will go wrong if the
> user uses max_retries without shaky path detection?

In the case were the path_checker keeps succeeding, but the IOs keep
hanging, multipathd will just keep restoring this path over and over
again. That's the sort of path ping-ponging that shaky path detection
should be able to stop. I guess this is can speed up other cases for
failover as well, so I can leave that off.  Hopefully people know that
if they are seeing ping-ponging, shaky path detection can help with it.

> 
> > ---
> >  libmultipath/config.c         |  3 +++
> >  libmultipath/config.h         |  3 +++
> >  libmultipath/dict.c           | 34 ++++++++++++++++++++++++++++
> >  libmultipath/discovery.c      | 42
> > ++++++++++++++++++++++++++++++++++-
> >  libmultipath/propsel.c        | 18 +++++++++++++++
> >  libmultipath/propsel.h        |  1 +
> >  libmultipath/structs.h        |  7 ++++++
> >  multipath/multipath.conf.5.in | 22 ++++++++++++++++++
> >  8 files changed, 129 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> > index 044067af..fc438947 100644
> > --- a/libmultipath/dict.c
> > +++ b/libmultipath/dict.c
> > @@ -1152,6 +1152,36 @@ declare_hw_snprint(eh_deadline,
> > print_undef_off_zero)
> >  declare_pc_handler(eh_deadline, set_undef_off_zero)
> >  declare_pc_snprint(eh_deadline, print_undef_off_zero)
> >  
> > +static int
> > +set_max_retries(vector strvec, void *ptr, const char *file, int
> > line_nr)
> > +{
> > +	char * buff;
> > +	int *int_ptr = (int *)ptr;
> > +
> > +	buff = set_value(strvec);
> > +	if (!buff)
> > +		return 1;
> > +
> > +	if (strcmp(buff, "off") == 0)
> > +		*int_ptr = UOZ_OFF;
> > +	else if (strcmp(buff, "0") == 0)
> > +		*int_ptr = UOZ_ZERO;
> 
> Why not use MAX_RETRIES_OFF and MAX_RETRIES_ZERO here?

No good reason. I can change that.

> > +	else
> > +		do_set_int(strvec, int_ptr, 1, 5, file, line_nr,
> > buff);
> > +
> > +	free(buff);
> > +	return 0;
> > +}
> > +
> 
> > diff --git a/multipath/multipath.conf.5.in
> > b/multipath/multipath.conf.5.in
> > index 226d0019..41f3927e 100644
> > --- a/multipath/multipath.conf.5.in
> > +++ b/multipath/multipath.conf.5.in
> > @@ -793,6 +793,22 @@ The default is: \fB<unset>\fR
> >  .
> >  .
> >  .TP
> > +.B max_retries
> > +Specify the maximum number of times the SCSI layer will retry IO
> > commands
> > +before returning failure. Setting this can be helpful for cases
> > where the path
> > +checker succeeds, but IO commands hang and timeout. By default, the
> > SCSI layer
> > +will retry IOs 5 times. Reducing this value will allow multipath to
> > retry the IO
> > +down another path sooner. \fBNote:\fR If it is necessary to set this
> 
> Maybe you should add "[retry IO commands] for some types of SCSI
> errors". Otherwise users might get their hopes too high. IIRC the main
> cases that matter here are DID_TRANSPORT_DISRUPTED and perhaps
> DID_BUS_BUSY (don't put that in the man page) ;-)

Sure.

-Ben

> 
> 
> > value, it
> > +is also recommended to set up shaky paths detection. See "Shaky
> > paths detection"
> > +below. Valid values are
> > +\fB0\fR through \fB5\fR.
> > +.RS
> > +.TP
> > +The default is: \fB<unset>\fR
> > +.RE
> 
> 





[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux