Re: [PATCH] libmultipath: update INFINIDAT builtin config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 29 Aug 2017, at 2:02, Xose Vazquez Perez <xose.vazquez@xxxxxxxxx> wrote:

On 08/28/2017 03:59 PM, Arnon Yaari wrote:

On 25 Aug 2017, at 3:26, Xose Vazquez Perez <xose.vazquez@xxxxxxxxx> wrote:

On 08/22/2017 03:37 PM, Arnon Yaari wrote:

- .pgfailback    = -FAILBACK_IMMEDIATE,
+ .pgfailback    = 30,

Why is it needed?

All the values suggested here are the values that the company’s interoperability
team found work best for the product on Linux. The timeout values are there
because in some upgrade scenarios we don’t want to fail devices until the
upgrade finishes.

Were they tested with a *current* kernel and multipath-tools?

Of course. We do extensive interoperability testing on RHEL, CentOS, Ubuntu, SLES, on old and all new versions.
The way our "hot upgrade" process works can cause device loss if these values are not in place.


I don’t think there is a reason to add in-line comments in the code to
explain the values (none of the other vendors are doing that).

It should be documented in the body of the git patch(changelog).

Anything other than default values must be documented.
This is a standard request: https://marc.info/?t=149850399700003

All vendor config commits I see just link to the vendor support docs, which makes sense because the vendor config is the authority on what needs to be set, and the docs themselves are the explanation (perhaps we need to add the reasons to the docs...).
I will change the patch message anyway.


+ .selector      = "round-robin 0",

round-robin is the dumbest, queue-length is smarter and
service-time is the smartest and default selector.

Regardless of which algorithm is “dumbest” or “smartest”, round-robin
is a legitimate path selection algorithm that the vendor decided works
best with its product. The vendor knows its product best :)
The product’s customers set the values suggested here (manually or
using tools provided by the vendor), whether they are upstream or not.

In an *ideal environment* , "round-robin" "queue-length" and "service-time"
work the same way, doing a round robin distribution.

But when things get ugly, "round-robin" could lost data.

"service-time" works much better in any environment, even asymmetrical.
In highly congested links, with different link speeds, faulty paths, ...

path_selector is like TCP congestion control algorithm.

We may decide to change this behavior in the future. I will pass your input to our product team. I don’t expect it’s something we’ll change in a short time frame without more discussion and testing.
We don’t see how round-robin can lose data. Can you please elaborate?
In any case, this is our current configuration that is set in customer environments. For better or worse, not accepting this doesn’t change anything other than inconveniencing users.


+ .no_path_retry = NO_PATH_RETRY_FAIL,

Default value.

Isn’t the default NO_PATH_RETRY_UNDEF?


Practically it's the same: https://marc.info/?l=dm-devel&m=147854542623733

I’m not convinced it’s the same. The message you linked says it takes what is in the “features” line, which can be changed.
Even if it’s implicitly the same, it’s not really the default. We prefer to specify this explicitly.



Thank you.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel

[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux