On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@xxxxxxxxxx wrote:
> If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
> in select_fast_io_fail.
>
> So, multipath will not run the limited of dev_loss_tmo to 600.

Yes, but the kernel will. With your patch installed, if I disable
fast_io_fail_tmo and set no_path_retry to queue, I get these messages:

Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to 2147483647, error 22

This happens because, if fast_io_fail_tmo is not set, the kernel itself will
bar dev_loss_tmo from being above 600 seconds.

Also, even if you could set dev_loss_tmo to its maximum without
fast_io_fail_tmo set, you would never want to, because you would break
multipath. With fast_io_fail_tmo disabled, the scsi device will never pass
the failed IO back up until dev_loss_tmo triggers. This means that if you
lose a path on your multipath device while doing IO, you won't be able to
resend that IO down another path for 68 years (2147483647 seconds). Also,
all the synchronous checker functions will not return for 68 years. And
during all this time these processes will be in uninterruptible sleep. At
that point, there would be no point to even having multiple paths, because
you couldn't ever actually use them if one went down.

> And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
> after multipath run select_fast_io_fail even if it's not set.

This is true in the default case, but we can't rely on the default case.
Since we allow users to turn it off, we need to correctly configure
multipath when it is off.
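
To make the numbers above concrete, here is a minimal sketch of the check
being described. It is only a model of the behaviour discussed in this
thread, not the kernel or multipath source: every sketch_* name and the
SKETCH_FAST_IO_FAIL_OFF marker are made up for illustration. The "error 22"
in the log line is EINVAL on Linux.

```c
#include <errno.h>
#include <limits.h>

/* Values taken from this thread; the names are illustrative only. */
#define SKETCH_DEV_LOSS_CAP     600u     /* limit when fast_io_fail_tmo is off */
#define SKETCH_MAX_DEV_LOSS     INT_MAX  /* 2147483647, as in the log message */
#define SKETCH_FAST_IO_FAIL_OFF (-1)     /* hypothetical "disabled" marker */

/*
 * Model of the rejection described above: with fast_io_fail_tmo off,
 * a dev_loss_tmo above the cap is refused with -EINVAL.
 */
static int sketch_set_dev_loss(int fast_io_fail, unsigned int dev_loss)
{
	if (fast_io_fail == SKETCH_FAST_IO_FAIL_OFF &&
	    dev_loss > SKETCH_DEV_LOSS_CAP)
		return -EINVAL;
	return 0;
}

/*
 * Model of limiting the value before writing it, so that the write
 * is not rejected: clamp to the cap when fast_io_fail_tmo is off.
 */
static unsigned int sketch_clamp_dev_loss(int fast_io_fail,
					  unsigned int dev_loss)
{
	if (fast_io_fail == SKETCH_FAST_IO_FAIL_OFF &&
	    dev_loss > SKETCH_DEV_LOSS_CAP)
		return SKETCH_DEV_LOSS_CAP;
	return dev_loss;
}

/* Convert seconds to years, using a 365.25 day year. */
static double sketch_seconds_to_years(unsigned int seconds)
{
	return seconds / (365.25 * 24.0 * 3600.0);
}
```

Under these assumptions, sketch_set_dev_loss(SKETCH_FAST_IO_FAIL_OFF,
SKETCH_MAX_DEV_LOSS) returns -EINVAL, sketch_clamp_dev_loss() with the same
arguments returns 600, and sketch_seconds_to_years(SKETCH_MAX_DEV_LOSS) is
a little over 68.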

-Ben

> Original Message
> From: BenjaminMarzinski
> To: 彭亮10137102;
> Cc: <dm-devel@xxxxxxxxxx>张凯10072500;
> Date: 2016-11-29 08:30
> Subject: Re: [PATCH] libmultipath: ensure dev_loss_tmo will be
> update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
>
> On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@xxxxxxxxxx wrote:
> > From: PengLiang <peng.liang5@xxxxxxxxxx>
> >
> > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
> > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
>
> Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set? Multipath
> was using this limit, since the underlying system uses it.
>
> -Ben
>
> >
> > Signed-off-by: PengLiang <peng.liang5@xxxxxxxxxx>
> > ---
> >  libmultipath/discovery.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> > index aaa915c..05b0842 100644
> > --- a/libmultipath/discovery.c
> > +++ b/libmultipath/discovery.c
> > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
> >                  goto out;
> >              }
> >          }
> > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
> >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
> >              "fast_io_fail is not set",
> >              rport_id, DEFAULT_DEV_LOSS_TMO);
> > --
> > 2.8.1.windows.1
>
> --
> dm-devel mailing list
> dm-devel@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/dm-devel