On Wed, 2024-09-04 at 15:51 -0400, Benjamin Marzinski wrote: > On Wed, Sep 04, 2024 at 08:57:46PM +0200, Martin Wilck wrote: > > On Wed, 2024-08-28 at 18:17 -0400, Benjamin Marzinski wrote: > > > Instead of updating the path priorities and possibly reloading > > > the > > > multipath device when each path is updated, wait till all paths > > > have been updated, and then go through the multipath devices > > > updating > > > the priorities once, reloading if necessary. > > > > > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > > > --- > > > libmultipath/structs.h | 9 +++ > > > multipathd/main.c | 169 ++++++++++++++++++++++++++--------- > > > ---- > > > -- > > > 2 files changed, 118 insertions(+), 60 deletions(-) > > > > > > diff --git a/libmultipath/structs.h b/libmultipath/structs.h > > > index af8e31e9..1f531d30 100644 > > > --- a/libmultipath/structs.h > > > +++ b/libmultipath/structs.h > > > @@ -318,6 +318,7 @@ enum check_path_states { > > > CHECK_PATH_UNCHECKED, > > > CHECK_PATH_STARTED, > > > CHECK_PATH_CHECKED, > > > + CHECK_PATH_NEW_UP, > > > CHECK_PATH_SKIPPED, > > > CHECK_PATH_REMOVED, > > > }; > > > @@ -421,6 +422,13 @@ enum prflag_value { > > > PRFLAG_SET, > > > }; > > > > > > +enum prio_update_type { > > > + PRIO_UPDATE_NONE, > > > + PRIO_UPDATE_NORMAL, > > > + PRIO_UPDATE_NEW_PATH, > > > + PRIO_UPDATE_MARGINAL, > > > +}; > > > + > > > struct multipath { > > > char wwid[WWID_SIZE]; > > > char alias_old[WWID_SIZE]; > > > @@ -464,6 +472,7 @@ struct multipath { > > > int queue_mode; > > > unsigned int sync_tick; > > > int synced_count; > > > + enum prio_update_type prio_update; > > > uid_t uid; > > > gid_t gid; > > > mode_t mode; > > > diff --git a/multipathd/main.c b/multipathd/main.c > > > index 2bcc6066..1511f701 100644 > > > --- a/multipathd/main.c > > > +++ b/multipathd/main.c > > > @@ -1996,15 +1996,13 @@ mpvec_garbage_collector (struct vectors * > > > vecs) > > > * best pathgroup, and this is the first path in the pathgroup > > > to > > > come back > > > * up, then switch to this pathgroup */ > > > static int > > > -followover_should_failback(struct path * pp) > > > +do_followover_should_failback(struct path * pp) > > > { > > > struct pathgroup * pgp; > > > struct path *pp1; > > > int i; > > > > > > - if (pp->mpp->pgfailback != -FAILBACK_FOLLOWOVER || > > > - !pp->mpp->pg || !pp->pgindex || > > > - pp->pgindex != pp->mpp->bestpg) > > > + if (pp->pgindex != pp->mpp->bestpg) > > > return 0; > > > > What happened to the !pp->pgindex test? Is it not necessary any > > more? > > In followover_should_failback(), before calling this function, we > test > if mpp->bestpg is 0 and return early if it is (because 0 is not a > valid > pathgroup id). Thus, if pp->pgindex equals mpp->bestpg (and we > should > failback), it's obviously not zero. If it's not equal to mpp->bestpg, > then it doesn't matter what it is, because we aren't doing the > failback anyway. Right? Ok, thanks for clarifying it ;-) > > > > This test used to be pp->state == PATH_DOWN (see below). > > > > Just trying to confirm that you did this on purpose. It might > > justify a > > separate patch, with rationale. After all, AFAICS, we will now keep > > the > > old priority for all path states except PATH_UP and PATH_GHOST, > > while > > previously we attempted to fetch the prio for most of them, and > > only > > used the previous value if fetching the prio failed. > > In my understanding, we don't try getting the priority of failed > paths > because some of the prioritizers can hang. On this grounds, if we are > excluding PATH_DOWN we should definitely also exclude PATH_SHAKY and > PATH_TIMEOUT and probably also PATH_DELAYED. > > Also, since only paths that are in PATH_UP or PATH_GHOST are usable > by > the kernel and contribute to the pathgroup's priority, not attempting > to > update the priority of these other paths avoids the risk of hanging > whil > not hurting anything (except that they might end up in the wrong > pathgroup during the time when they aren't usable, if you have > group_by_prio set). > > Lastly, The way do_check_path() worked before this patchset was > inconsistent. If a path changed state to anything other than PATH_UP > or > PATH_GHOST we skipped the prio update. However if the path kept the > same > state we did the prio update, even if it wasn't PATH_UP or > PATH_GHOST, > as long as it wasn't PATH_DOWN. That seems wrong. > > But I'm fine with making these changes in a separate patch. That would be great, and please include the above rationale. Martin