On Wed, May 31, 2023 at 04:27:25PM +0000, Martin Wilck wrote: > On Wed, 2023-05-24 at 18:21 -0500, Benjamin Marzinski wrote: > > For multipath devices with path group policies other than > > group_by_prio, > > multipathd wasn't updating all the paths' priorities when calling > > need_switch_pathgroup(), even in cases where it likely was necessary. > > When a path just becomes usable again, all paths' priorities get > > updated > > by update_prio(). But if the priority changes on a path that is > > already > > up, the other paths' priorities only get updated if the multipath > > device > > uses the group_by_prio path_grouping_policy, otherwise > > need_switch_pathgroup() is called with refresh set to 0. But if the > > priority of the checked path has changed, then likely so have the > > priorities of other paths. Since the pathgroup's priority is the > > average > > of its paths' priorities, changing the priority of just one path may > > not > > change the average enough to reorder the pathgroups. > > > > Instead, set refresh in need_switch_pathgroup() if the priorty has > > changed to something other than PRIO_UNDEF (which usually means an > > error > > has occured) and the priorities of the other paths haven't already > > been > > updated by update_prio(). > > > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > > --- > > multipathd/main.c | 35 +++++++++++++++++++++-------------- > > 1 file changed, 21 insertions(+), 14 deletions(-) > > > > diff --git a/multipathd/main.c b/multipathd/main.c > > index bdeffe76..e7c272ad 100644 > > --- a/multipathd/main.c > > +++ b/multipathd/main.c > > @@ -2575,20 +2575,27 @@ check_path (struct vectors * vecs, struct > > path * pp, unsigned int ticks) > > > > if (marginal_changed) > > reload_and_sync_map(pp->mpp, vecs, 1); > > - else if (update_prio(pp, new_path_up) && > > - (pp->mpp->pgpolicyfn == (pgpolicyfn *)group_by_prio) && > > - pp->mpp->pgfailback == -FAILBACK_IMMEDIATE) { > > - condlog(2, "%s: path priorities changed. reloading", > > - pp->mpp->alias); > > - reload_and_sync_map(pp->mpp, vecs, !new_path_up); > > - } else if (need_switch_pathgroup(pp->mpp, 0)) { > > - if (pp->mpp->pgfailback > 0 && > > - (new_path_up || pp->mpp->failback_tick <= 0)) > > - pp->mpp->failback_tick = > > - pp->mpp->pgfailback + 1; > > - else if (pp->mpp->pgfailback == -FAILBACK_IMMEDIATE > > || > > - (chkr_new_path_up && > > followover_should_failback(pp))) > > - switch_pathgroup(pp->mpp); > > + else { > > + int prio_changed = update_prio(pp, new_path_up); > > + bool need_refresh = (!new_path_up && prio_changed && > > + pp->priority != PRIO_UNDEF); > > + > > I have always found it confusing that we recalculate the priorities in > two functions (update_prio() and need_switch_pathgroup()), passing > boolean flags back and forth. IMO we should move this logic to > update_prio(), so that we don't need to refresh any priorities in > need_switch_pathgroup() any more. after determining the prio of the > "primary" path device, update_prio() has all the information > it needs to figure out whether priorities of other paths must be > refreshed. > > That would even make the code easier to understand, IMO. > > Regards > Martin So the difference in this code between when we choose to update all the paths' prios for the group_by_prio case, and when we choose to update all the paths' prios for the other pgpolicies comes down to how we treat PRIO_UNDEF. I didn't change the group_by_prio behavior. So right now, for group_by_prio devices, we will update all the paths' priorities if the checked path switches priorities to PRIO_UNDEF. My question is, "Is this the right thing to do?" In the best case, if the prioritizer fails on one path, it will fail on all the other paths in the pathgroup as well, so that they stay together. In the worst case it will fail on paths in other pathgroups, so that incorrect paths get grouped together. Granted, I'm not sure how much of a difference it makes in the worst case, since the other priorities would get checked eventually, and would get placed in the wrong group then. Perhaps it would be better to treat PRIO_UNDEF like PATH_PENDING, where we will continue to use the old priority if we get a PRIO_UNDEF result. The choices are: 1. make both the group_by_prio and the non-group_by_prio cases recheck all paths on PRIO_UNDEF (this keeps the group_by_prio behavior the same). 2. make both cases NOT recheck all paths on PRIO_UNDEF. 3. keep the destinction between the two (make update_prio() check the pgplicy, and act accordingly) 4. Make paths keep their previous priority when they would have returned PRIO_UNDEF, so we never switch to PRIO_UNDEF. All the choices except 3 seem reasonable. 1 keeps things how they are for group_by_prio. 2 leans towards moving PRIO_UNDEF paths out of their current pathgroup. 4 leans towards keeping PRIO_UNDEF paths in their current pathgroup. The other question is, what do we do for the delayed case. Right now, once we finish waiting for our delay in deferred_failback_tick(), we automatically refresh the priorities of all the devices in our need_switch_pathgroup() call. We could add an update_prio() call before it to keep this behavior, but if we are already refreshing all the paths' priorities when we need to, I'm not sure that it's necessary to do it again here. Thoughts? -Ben > > > > > + if (prio_changed && > > + pp->mpp->pgpolicyfn == (pgpolicyfn > > *)group_by_prio && > > + pp->mpp->pgfailback == -FAILBACK_IMMEDIATE) { > > + condlog(2, "%s: path priorities changed. > > reloading", > > + pp->mpp->alias); > > + reload_and_sync_map(pp->mpp, vecs, > > !new_path_up); > > + } else if (need_switch_pathgroup(pp->mpp, > > need_refresh)) { > > + if (pp->mpp->pgfailback > 0 && > > + (new_path_up || pp->mpp->failback_tick <= > > 0)) > > + pp->mpp->failback_tick = > > + pp->mpp->pgfailback + 1; > > + else if (pp->mpp->pgfailback == - > > FAILBACK_IMMEDIATE || > > + (chkr_new_path_up && > > + followover_should_failback(pp))) > > + switch_pathgroup(pp->mpp); > > + } > > } > > return 1; > > } -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel