Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue

On Wed, Dec 07, 2016 at 02:42:16PM +0800, peng.liang5@xxxxxxxxxx wrote:
>    Hello, Ben
> 
>    Sorry for the late reply.
> 
>    It is just as you said below: if fast_io_fail_tmo is off, we have to cap
>    dev_loss_tmo at 600. So this patch is misguided and will cause a kernel
>    error.
> 
>    One more question: should the system limit dev_loss_tmo to 600 if
>    fast_io_fail_tmo is set to 0?

No. The kernel doesn't limit dev_loss_tmo in this case. From a quick
test, it looks like setting fast_io_fail_tmo to 0 means that the scsi
layer fails the IO back immediately, without any waiting for the path to
return. This means that any value for dev_loss_tmo should be fine.
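
To make the cases discussed in this thread concrete, here is a minimal sketch of the capping decision. It is not the libmultipath code: the names (SKETCH_FIF_OFF, sketch_pick_dev_loss, and the two macros) are made up for illustration, and the sentinel value for "off" is an assumption. Only the 600 second cap and the 2147483647 value come from the thread.

```c
/* Illustrative sketch only; every name here is hypothetical. */
#define SKETCH_FIF_OFF       (-1)          /* assumed sentinel: fast_io_fail_tmo disabled */
#define SKETCH_DEV_LOSS_CAP  600u          /* cap discussed in this thread */
#define SKETCH_DEV_LOSS_MAX  2147483647u   /* value seen in the quoted log message */

/*
 * Return the dev_loss_tmo value to request for an rport.
 * fast_io_fail is SKETCH_FIF_OFF, 0 (fail IO back immediately),
 * or a positive timeout in seconds.
 */
unsigned int sketch_pick_dev_loss(int fast_io_fail, unsigned int requested)
{
	/* With fast_io_fail_tmo off, IO is held until dev_loss_tmo fires,
	 * so the request is capped. */
	if (fast_io_fail == SKETCH_FIF_OFF && requested > SKETCH_DEV_LOSS_CAP)
		return SKETCH_DEV_LOSS_CAP;

	/* With fast_io_fail_tmo at 0 or a positive value, no cap is applied. */
	return requested;
}
```

With this sketch, sketch_pick_dev_loss(SKETCH_FIF_OFF, SKETCH_DEV_LOSS_MAX) gives 600, while sketch_pick_dev_loss(0, SKETCH_DEV_LOSS_MAX) leaves the request untouched.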

Thanks.
-Ben
 
>    I look forward to your reply. Thanks.
> 
>                                     Original Message
>    From: Benjamin Marzinski
>    To: 彭亮10137102;
>    Cc: 张凯10072500; <dm-devel@xxxxxxxxxx>
>    Date: 2016-12-02 00:51
>    Subject: Re: [PATCH] libmultipath: ensure dev_loss_tmo will be
>    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> 
>    On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@xxxxxxxxxx wrote:
>    >    If fast_io_fail_tmo isn't set, select_fast_io_fail will use
>    >    DEFAULT_FAST_IO_FAIL.
>    > 
>    >    So multipath will not apply the 600 second limit to dev_loss_tmo.
> 
>    Yes, but the kernel will. With your patch installed, if I disable
>    fast_io_fail_tmo and set no_path_retry to queue, I get these messages
> 
>    Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
>    2147483647, error 22
> 
>    Because if fast_io_fail_tmo is not set, the kernel itself will bar
>    dev_loss_tmo from being above 600 seconds. Also, even if you could set
>    dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
>    never want to, because you would break multipath.
> 
>    With fast_io_fail_tmo disabled, the scsi device will never pass the
>    failed IO back up until dev_loss_tmo triggers.  This means that if you
>    lose a path on your multipath device while doing IO, you won't be able
>    to resend that IO down another path for 68 years (2147483647 seconds).
>    Also, all the synchronous checker functions will not return for 68
>    years. And during all this time these processes will be in
>    uninterruptible sleep. At that point, there would be no point to even
>    having multiple paths, because you couldn't ever actually use them if
>    one went down.
> 
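
The 68 year figure quoted above is simply 2147483647 seconds converted to years. A small check, assuming a 365.25-day year (the helper name is made up for illustration):

```c
/* Hypothetical helper: convert a timeout in seconds to years.
 * Assumes a 365.25-day year, i.e. 31557600 seconds. */
double sketch_seconds_to_years(double seconds)
{
	const double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;

	return seconds / seconds_per_year;
}
```

Calling sketch_seconds_to_years(2147483647.0) gives a little over 68 years.
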
>    > 
>    >    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
>    >    once multipath has run select_fast_io_fail, even if fast_io_fail_tmo
>    >    is not set.
> 
>    This is true in the default case, but we can't rely on the default case.
>    Since we allow users to turn it off, we need to correctly configure
>    multipath when it is off.
> 
>    -Ben
> 
>    >                                     Original Message
>    >    From: Benjamin Marzinski
>    >    To: 彭亮10137102;
>    >    Cc: <dm-devel@xxxxxxxxxx>; 张凯10072500;
>    >    Date: 2016-11-29 08:30
>    >    Subject: Re: [PATCH] libmultipath: ensure dev_loss_tmo will be
>    >    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
>    > 
>    >    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@xxxxxxxxxx wrote:
>    >    > From: PengLiang <peng.liang5@xxxxxxxxxx>
>    >    > 
>    >    > If no_path_retry is set to queue, we should make sure dev_loss_tmo is updated to MAX_DEV_LOSS_TMO.
>    >    > But it will be limited to 600 if fast_io_fail_tmo is set to off or 0 at the same time.
>    > 
>    >    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set? Multipath
>    >    was using this limit, since the underlying system uses it.
>    > 
>    >    -Ben
>    > 
>    >    > 
>    >    > Signed-off-by: PengLiang <peng.liang5@xxxxxxxxxx>
>    >    > ---
>    >    >  libmultipath/discovery.c | 3 ++-
>    >    >  1 file changed, 2 insertions(+), 1 deletion(-)
>    >    > 
>    >    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
>    >    > index aaa915c..05b0842 100644
>    >    > --- a/libmultipath/discovery.c
>    >    > +++ b/libmultipath/discovery.c
>    >    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>    >    >                  goto out;
>    >    >              }
>    >    >          }
>    >    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
>    >    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
>    >    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>    >    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>    >    >              "fast_io_fail is not set",
>    >    >              rport_id, DEFAULT_DEV_LOSS_TMO);
>    >    > -- 
>    >    > 2.8.1.windows.1
>    > 
>    >    --
>    >    dm-devel mailing list
>    >    dm-devel@xxxxxxxxxx
>    >    https://www.redhat.com/mailman/listinfo/dm-devel
> 





