Re: [PATCH] multipath-tools: update no_path_retry value for IBM/2145


 



On Tue, 2024-02-13 at 00:42 +0100, Xose Vazquez Perez wrote:
> On 8/26/21 8:47 AM, Martin Wilck wrote:
>     ^^^^^^^
> It is never too late!

:-)

> Some history:
> 
> first commit 3eb8c380a :
>         {
>                 /* IBM SAN Volume Controller */
>                 .vendor        = "IBM",
>                 .product       = "2145",
>                 .getuid        = DEFAULT_GETUID,
>                 .getprio       = "mpath_prio_alua /dev/%n",
>                 .features      = "1 queue_if_no_path",
>                 .hwhandler     = DEFAULT_HWHANDLER,
>                 .selector      = DEFAULT_SELECTOR,
>                 .pgpolicy      = GROUP_BY_PRIO,
>                 .pgfailback    = -FAILBACK_IMMEDIATE,
>                 .rr_weight     = RR_WEIGHT_NONE,
>                 .no_path_retry = NO_PATH_RETRY_UNDEF,
>                 .minio         = DEFAULT_MINIO,
>                 .checker_name  = TUR,
>         },
> 
> NO_PATH_RETRY_UNDEF was removed in b7c3cf014 because it was the
> default value,
> and later "1 queue_if_no_path" was replaced by NO_PATH_RETRY_QUEUE in
> 87ea76f99

... which shows that the default has been "queue" for almost 18 years.

> IBM docs recommends:
> no_path_retry 5 # or no_path_retry "fail" for some current linux
> distros
> 
> IBM Storage FlashSystem 5200, 5000, 5100, Storwize V5100 and V5000E:
> https://www.ibm.com/docs/en/flashsystem-5x00/8.6.x?topic=system-settings-linux-hosts
> 
> IBM Storage FlashSystem 7300, 7200 and Storwize V7000:
> https://www.ibm.com/docs/en/flashsystem-7x00/8.6.x?topic=system-settings-linux-hosts
> 
> IBM FlashSystem V9000:
> https://www.ibm.com/docs/en/flashsystem-v9000/8.3.x?topic=system-settings-linux-hosts
> 
> IBM Storage FlashSystem 9500, 9200 and 9100:
> https://www.ibm.com/docs/en/flashsystem-9x00/8.6.x?topic=system-settings-linux-hosts
> 
> Therefore, we should change this value.

I tend to disagree. It's true that we usually follow vendor
recommendations. But in this case, I think the change would do more
harm than good, because we've defaulted to "queue" basically forever
for this product. Suddenly switching to a rather short no_path_retry
value might come as an unpleasant surprise for users. Users who follow
the IBM recommendations (using explicit multipath.conf settings) won't
notice the change anyway, but those who rely on our defaults might even
lose data.
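
For illustration, an explicit override of the kind mentioned above might
look roughly like this in multipath.conf. This is only a sketch: the
vendor and product strings are taken from the hwtable entry quoted
earlier, and the value 5 is the one quoted from the IBM docs, not a
recommendation.

```
devices {
        device {
                vendor        "IBM"
                product       "2145"
                # Retry 5 times once all paths are down, then fail I/O.
                # Use "queue" instead to keep the built-in behaviour.
                no_path_retry 5
        }
}
```

With a stanza like this in place, a change to the built-in default for
this product would not affect the host.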

In general, I believe vendors' recommendations about "no_path_retry"
don't mean much. This setting doesn't depend on the properties of the
hardware; it is rather the preference of the end customer [*]. IMHO
"fail" or low numeric values of no_path_retry mainly make sense in
cluster configurations. Unfortunately, IBM gives no rationale for this
recommendation in its manuals [+].

But I'm not religious on the matter; more opinions welcome.

Martin

[*] Vendors can recommend a lower limit for no_path_retry, in the sense
"with this product, it can happen that zero paths are available for N
seconds during a firmware update", but a fixed no_path_retry value acts
as an upper limit.
[+] I suspect that the recommendations in the current IBM manuals have
just been copy/pasted from earlier ones, without much consideration.





