On Tue, Aug 27, 2019 at 12:28:14PM +0000, Chongyun Wu wrote:
> Hi Martin and Ben,
>
> Could you please review the patch below? Thanks.
>
> From a7126e33e7eff8a985600b41b1723ee66b183586 Mon Sep 17 00:00:00 2001
> From: Chongyun Wu <wu.chongyun@xxxxxxx>
> Date: Tue, 27 Aug 2019 10:23:50 +0800
> Subject: [PATCH] multipathd: "san_path_err" failure optimization
>
> Let san_path_err_recovery_time detect an unstable path, and do not
> reinstate that path until it has stayed up for
> san_path_err_recovery_time. This fixes the heavy IO delay caused by
> some paths flapping in a multipath device.
>
> Test and result:
> Ran a script that brings eth1 up for 30s and down for 30s, for 100
> loops, to make some paths shaky in each multipath device.
> Used the following multipath.conf settings in the defaults section:
> san_path_err_recovery_time 30
> san_path_err_threshold 2
> san_path_err_forget_rate 6
> After the test, no IO delay logs were found except for several at the
> very beginning, before the san_path_err filter started to filter out
> the shaky paths.
> Without the above config and this patch, there are lots of IO delay
> messages in syslog and some paths change state from up to down again
> and again.
>
> Signed-off-by: Chongyun Wu <wu.chongyun@xxxxxxx>

Reviewed-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>

> ---
>  multipathd/main.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 7a5cd11..8acd080 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1897,6 +1897,18 @@ static int check_path_reinstate_state(struct path * pp) {
>  		condlog(2, "%s : reinstating path early", pp->dev);
>  		goto reinstate_path;
>  	}
> +
> +	/* If path became failed again or continue failed, should reset
> +	 * path san_path_err_forget_rate and path dis_reinstate_time to
> +	 * start a new stable check.
> +	 */
> +	if ((pp->state != PATH_UP) && (pp->state != PATH_GHOST) &&
> +	    (pp->state != PATH_DELAYED)) {
> +		pp->san_path_err_forget_rate =
> +			pp->mpp->san_path_err_forget_rate;
> +		pp->dis_reinstate_time = curr_time.tv_sec;
> +	}
> +
>  	if ((curr_time.tv_sec - pp->dis_reinstate_time ) > pp->mpp->san_path_err_recovery_time) {
>  		condlog(2,"%s : reinstate the path after err recovery time", pp->dev);
>  		goto reinstate_path;
> @@ -2106,6 +2118,11 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
>  		    check_path_reinstate_state(pp)) {
>  			pp->state = PATH_DELAYED;
>  			return 1;
> +		} else if ((newstate != PATH_UP && newstate != PATH_GHOST) &&
> +			   (pp->state == PATH_DELAYED)) {
> +			/* If path state become failed again cancel path delay state */
> +			pp->state = newstate;
> +			return 1;
>  		}
>
>  		if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
> --
>
> Best Regards,
> Chongyun Wu
>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
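
For readers following the thread, the behaviour the patch aims for can be
sketched as a small stand-alone model. This is only an illustration under
stated assumptions, not multipathd code: the sketch_* types and functions
are hypothetical, the enum values are arbitrary, and the path is assumed to
have already exceeded san_path_err_threshold (the threshold and forget-rate
accounting is left out). It shows the stability window restarting whenever
the path was last seen failed, so a flapping path stays delayed until it has
been up for longer than the recovery time.

```c
#include <assert.h>

/* State names follow the patch; the numeric values are arbitrary. */
enum path_state { PATH_DOWN, PATH_UP, PATH_GHOST, PATH_DELAYED };

/* Hypothetical, cut-down stand-ins for the path and map settings. */
struct sketch_path {
	enum path_state state;   /* state recorded by the previous check */
	long dis_reinstate_time; /* start of the current stability window */
	int forget_rate;         /* per-path forget-rate counter */
};

struct sketch_cfg {
	long recovery_time;      /* models san_path_err_recovery_time */
	int forget_rate;         /* models san_path_err_forget_rate */
};

/* Returns 1 to keep the path delayed, 0 to allow reinstating it. */
static int sketch_keep_delayed(struct sketch_path *pp,
			       const struct sketch_cfg *cfg, long now)
{
	assert(cfg->recovery_time >= 0);

	/* The path was failed at the previous check: start a new window. */
	if (pp->state != PATH_UP && pp->state != PATH_GHOST &&
	    pp->state != PATH_DELAYED) {
		pp->forget_rate = cfg->forget_rate;
		pp->dis_reinstate_time = now;
	}
	/* Reinstate only after the window has fully elapsed. */
	if (now - pp->dis_reinstate_time > cfg->recovery_time)
		return 0;
	return 1;
}

/* One checker pass: newstate is what the path checker just reported. */
static void sketch_check(struct sketch_path *pp, const struct sketch_cfg *cfg,
			 enum path_state newstate, long now)
{
	if (newstate == PATH_UP || newstate == PATH_GHOST) {
		if (sketch_keep_delayed(pp, cfg, now))
			pp->state = PATH_DELAYED;
		else
			pp->state = newstate;
	} else {
		/* A failure also cancels a pending PATH_DELAYED state. */
		pp->state = newstate;
	}
}
```

In this model, with a recovery time of 30, a path that comes up at t=100,
fails at t=125 and comes up again at t=140 is still delayed at t=170 and is
reinstated at t=171. If the window were not restarted on the failure, the
path would already be reinstated at t=140, which is the flapping case the
commit message describes.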