On Wed, 2023-05-24 at 18:21 -0500, Benjamin Marzinski wrote: > For multipath devices with path group policies other than > group_by_prio, > multipathd wasn't updating all the paths' priorities when calling > need_switch_pathgroup(), even in cases where it likely was necessary. > When a path just becomes usable again, all paths' priorities get > updated > by update_prio(). But if the priority changes on a path that is > already > up, the other paths' priorities only get updated if the multipath > device > uses the group_by_prio path_grouping_policy, otherwise > need_switch_pathgroup() is called with refresh set to 0. But if the > priority of the checked path has changed, then likely so have the > priorities of other paths. Since the pathgroup's priority is the > average > of its paths' priorities, changing the priority of just one path may > not > change the average enough to reorder the pathgroups. > > Instead, set refresh in need_switch_pathgroup() if the priorty has > changed to something other than PRIO_UNDEF (which usually means an > error > has occured) and the priorities of the other paths haven't already > been > updated by update_prio(). > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > --- > multipathd/main.c | 35 +++++++++++++++++++++-------------- > 1 file changed, 21 insertions(+), 14 deletions(-) > > diff --git a/multipathd/main.c b/multipathd/main.c > index bdeffe76..e7c272ad 100644 > --- a/multipathd/main.c > +++ b/multipathd/main.c > @@ -2575,20 +2575,27 @@ check_path (struct vectors * vecs, struct > path * pp, unsigned int ticks) > > if (marginal_changed) > reload_and_sync_map(pp->mpp, vecs, 1); > - else if (update_prio(pp, new_path_up) && > - (pp->mpp->pgpolicyfn == (pgpolicyfn *)group_by_prio) && > - pp->mpp->pgfailback == -FAILBACK_IMMEDIATE) { > - condlog(2, "%s: path priorities changed. reloading", > - pp->mpp->alias); > - reload_and_sync_map(pp->mpp, vecs, !new_path_up); > - } else if (need_switch_pathgroup(pp->mpp, 0)) { > - if (pp->mpp->pgfailback > 0 && > - (new_path_up || pp->mpp->failback_tick <= 0)) > - pp->mpp->failback_tick = > - pp->mpp->pgfailback + 1; > - else if (pp->mpp->pgfailback == -FAILBACK_IMMEDIATE > || > - (chkr_new_path_up && > followover_should_failback(pp))) > - switch_pathgroup(pp->mpp); > + else { > + int prio_changed = update_prio(pp, new_path_up); > + bool need_refresh = (!new_path_up && prio_changed && > + pp->priority != PRIO_UNDEF); > + I have always found it confusing that we recalculate the priorities in two functions (update_prio() and need_switch_pathgroup()), passing boolean flags back and forth. IMO we should move this logic to update_prio(), so that we don't need to refresh any priorities in need_switch_pathgroup() any more. after determining the prio of the "primary" path device, update_prio() has all the information it needs to figure out whether priorities of other paths must be refreshed. That would even make the code easier to understand, IMO. Regards Martin > + if (prio_changed && > + pp->mpp->pgpolicyfn == (pgpolicyfn > *)group_by_prio && > + pp->mpp->pgfailback == -FAILBACK_IMMEDIATE) { > + condlog(2, "%s: path priorities changed. > reloading", > + pp->mpp->alias); > + reload_and_sync_map(pp->mpp, vecs, > !new_path_up); > + } else if (need_switch_pathgroup(pp->mpp, > need_refresh)) { > + if (pp->mpp->pgfailback > 0 && > + (new_path_up || pp->mpp->failback_tick <= > 0)) > + pp->mpp->failback_tick = > + pp->mpp->pgfailback + 1; > + else if (pp->mpp->pgfailback == - > FAILBACK_IMMEDIATE || > + (chkr_new_path_up && > + followover_should_failback(pp))) > + switch_pathgroup(pp->mpp); > + } > } > return 1; > } -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel