On Thu, 2018-12-20 at 16:17 +0100, Hannes Reinecke wrote:
> > +
> > +enum {
> > +	ANA_PRIO_OPTIMIZED		= 50,
> > +	ANA_PRIO_NONOPTIMIZED		= 10,
> > +	ANA_PRIO_INACCESSIBLE		= 5,
> > +	ANA_PRIO_PERSISTENT_LOSS	= 1,
> > +	ANA_PRIO_CHANGE			= 0,
> > +	ANA_PRIO_RESERVED		= 0,
> > +	ANA_PRIO_GETCTRL_FAILED		= -1,
> > +	ANA_PRIO_NOT_SUPPORTED		= -2,
> > +	ANA_PRIO_GETANAS_FAILED		= -3,
> > +	ANA_PRIO_GETANALOG_FAILED	= -4,
> > +	ANA_PRIO_GETNSID_FAILED		= -5,
> > +	ANA_PRIO_GETNS_FAILED		= -6,
> > +	ANA_PRIO_NO_MEMORY		= -7,
> > +	ANA_PRIO_NO_INFORMATION		= -8,
> > +};
>
> Please model the priorities according to the ALUA handler; ANA state
> 'persistent loss' maps onto ALUA 'unavailable' (and hence should have
> a priority of '0'), and ANA state 'inaccessible' is roughly similar to
> ALUA 'standby', hence should have a priority of '1'.

Will do. But please note that, in contrast to what we discussed
off-list, a priority of "0" has no special meaning. In particular,
pathgroup priority "0" (or negative!) doesn't imply that the PG in
question can't be selected for I/O. The only thing that is "special"
about priority 0 is that multipathd assigns this prio to PGs that have
no working paths. Therefore, a PG to which the prioritizer assigns
prio <= 0 will not be *preferred* over such a zero-path PG.

The only way to prevent the kernel from selecting a particular PG is to
set all paths in the PG to failed state, or to remove the PG altogether.
multipathd could try to set the PG to "disabled" state, but currently
it doesn't, and if it did, it wouldn't have the expected effect, because
"disabled" really just means "bypassed" in device mapper. A "bypassed"
PG will still be selected for I/O if no other PG has healthy paths.

(Side note: "bypassed" might actually be a reasonable PG state to use
for a PG consisting only of GHOST paths, but we don't do that today.)
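
To make the point about priorities and "bypassed" concrete, here is a
toy model of the selection logic as I described it above. This is only
a sketch: toy_pg and toy_select_pg are made-up names that exist neither
in multipath-tools nor in the kernel, and the real code is spread
between multipathd (which ranks PGs by priority) and dm-mpath (which
knows about failed paths and bypassed PGs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a path group; not a real type. */
struct toy_pg {
	int prio;          /* priority computed by the prioritizer */
	int healthy_paths; /* number of paths not in failed state */
	bool bypassed;     /* device mapper "disabled" state */
};

/*
 * Return the index of the PG that would carry I/O in this toy model,
 * or -1 if no PG has a healthy path.
 * Pass 0 looks at non-bypassed PGs only, pass 1 at bypassed PGs only.
 */
static int toy_select_pg(const struct toy_pg *pgs, size_t n)
{
	assert(pgs != NULL || n == 0);

	for (int pass = 0; pass < 2; pass++) {
		int best = -1;

		for (size_t i = 0; i < n; i++) {
			if (pgs[i].healthy_paths == 0)
				continue; /* only failed paths exclude a PG */
			if (pgs[i].bypassed != (pass == 1))
				continue; /* wrong class for this pass */
			if (best < 0 || pgs[i].prio > pgs[best].prio)
				best = (int)i;
		}
		if (best >= 0)
			return best;
	}
	return -1;
}
```

Note that prio is never compared against 0 here: a PG with prio -3 and
two healthy paths is still selected if nothing better is available, and
a bypassed PG is picked once every other PG has run out of healthy
paths.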
Therefore, I think that it makes sense to add an "ana path checker" to
multipathd, which would detect NVMe paths in states not suitable for
I/O and fail them in device mapper. We don't want device mapper to try
these paths.

I'm not quite sure about "inaccessible" state - your statement above
would imply that "inaccessible" shouldn't be failed. But the way I read
the ANA spec (8.19.4), simply trying I/O through "inaccessible" ports
would be wrong. Rather, the path should be monitored for a transition
to either "optimized" or "non-optimized" state. That matches the
behavior of the kernel native NVMe multipath driver, which AFAICS never
attempts I/O through any paths which aren't either "optimized" or
"non-optimized", and makes no distinction between "inaccessible" and
"persistent loss" states.

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
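
PS: The decision such an "ana path checker" would have to make could
look roughly like the sketch below. It is only an illustration of the
idea above, not an existing checker: the toy_* names are hypothetical,
and the numeric state codes are the ones I believe the kernel uses in
enum nvme_ana_state, so please double-check them against the spec.

```c
#include <stdbool.h>

/* ANA state codes; assumed to match the kernel's enum nvme_ana_state. */
enum toy_ana_state {
	TOY_ANA_OPTIMIZED	= 0x01,
	TOY_ANA_NONOPTIMIZED	= 0x02,
	TOY_ANA_INACCESSIBLE	= 0x03,
	TOY_ANA_PERSISTENT_LOSS	= 0x04,
	TOY_ANA_CHANGE		= 0x0f,
};

/*
 * Return true if I/O may be sent down a path in the given ANA state.
 * Everything except "optimized" and "non-optimized" is treated alike:
 * the path is reported as unusable, so that the caller can fail it in
 * device mapper and keep polling for a state transition.
 */
static bool toy_ana_path_usable(enum toy_ana_state state)
{
	switch (state) {
	case TOY_ANA_OPTIMIZED:
	case TOY_ANA_NONOPTIMIZED:
		return true;
	default:
		return false;
	}
}
```

With this, "inaccessible" and "persistent loss" end up in the same
bucket, like in the native NVMe multipath driver. The prioritizer would
then only matter for ranking the remaining usable paths.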