Re: [PATCH 1/7] libmultipath: Add max_retries config option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2023-11-02 at 18:15 -0400, Benjamin Marzinski wrote:
> This option lets multipath set a scsi disk's max_retries sysfs value.
> Setting this can be helpful for cases where the path checker
> succeeds,
> but IO commands hang and timeout. By default, the SCSI layer will
> retry
> IOs 5 times. Reducing this value will allow multipath to retry the IO
> down another path sooner.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>

2 nitpicks below. Please explain to me again why we recommend to
activate shaky paths detection with this. What will go wrong if the
user uses max_retries without shaky path detection?

> ---
>  libmultipath/config.c         |  3 +++
>  libmultipath/config.h         |  3 +++
>  libmultipath/dict.c           | 34 ++++++++++++++++++++++++++++
>  libmultipath/discovery.c      | 42
> ++++++++++++++++++++++++++++++++++-
>  libmultipath/propsel.c        | 18 +++++++++++++++
>  libmultipath/propsel.h        |  1 +
>  libmultipath/structs.h        |  7 ++++++
>  multipath/multipath.conf.5.in | 22 ++++++++++++++++++
>  8 files changed, 129 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> index 044067af..fc438947 100644
> --- a/libmultipath/dict.c
> +++ b/libmultipath/dict.c
> @@ -1152,6 +1152,36 @@ declare_hw_snprint(eh_deadline,
> print_undef_off_zero)
>  declare_pc_handler(eh_deadline, set_undef_off_zero)
>  declare_pc_snprint(eh_deadline, print_undef_off_zero)
>  
> +static int
> +set_max_retries(vector strvec, void *ptr, const char *file, int
> line_nr)
> +{
> +	char * buff;
> +	int *int_ptr = (int *)ptr;
> +
> +	buff = set_value(strvec);
> +	if (!buff)
> +		return 1;
> +
> +	if (strcmp(buff, "off") == 0)
> +		*int_ptr = UOZ_OFF;
> +	else if (strcmp(buff, "0") == 0)
> +		*int_ptr = UOZ_ZERO;

Why not use MAX_RETRIES_OFF and MAX_RETRIES_ZERO here?

> +	else
> +		do_set_int(strvec, int_ptr, 1, 5, file, line_nr,
> buff);
> +
> +	free(buff);
> +	return 0;
> +}
> +

> diff --git a/multipath/multipath.conf.5.in
> b/multipath/multipath.conf.5.in
> index 226d0019..41f3927e 100644
> --- a/multipath/multipath.conf.5.in
> +++ b/multipath/multipath.conf.5.in
> @@ -793,6 +793,22 @@ The default is: \fB<unset>\fR
>  .
>  .
>  .TP
> +.B max_retries
> +Specify the maximum number of times the SCSI layer will retry IO
> commands
> +before returning failure. Setting this can be helpful for cases
> where the path
> +checker succeeds, but IO commands hang and timeout. By default, the
> SCSI layer
> +will retry IOs 5 times. Reducing this value will allow multipath to
> retry the IO
> +down another path sooner. \fBNote:\fR If it is necessary to set this

Maybe you should add "[retry IO commands] for some types of SCSI
errors". Otherwise users might get their hopes too high. IIRC the main
cases that matter here are DID_TRANSPORT_DISRUPTED and perhaps
DID_BUS_BUSY (don't put that in the man page) ;-)



> value, it
> +is also recommended to set up shaky paths detection. See "Shaky
> paths detection"
> +below. Valid values are
> +\fB0\fR through \fB5\fR.
> +.RS
> +.TP
> +The default is: \fB<unset>\fR
> +.RE








[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux