On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote: > If a multipath device has no_path_retry set to a number and has lost > all > paths, gone into recovery mode, and timed out, it will disable > queue_if_no_paths. After that, if one of those failed paths is > removed, > when the device is reloaded, queue_if_no_paths will be re-enabled. > When > set_no_path_retry() is then called to update the queueing state, it > will > not disable queue_if_no_paths, since the device is still in the > recovery > state, so it believes no work needs to be done. The device will > remain > in the recovery state, with retry_ticks at 0, and queueing enabled, > even though there are no usable paths. > > To fix this, in set_no_path_retry(), if no_path_retry is set to a > number > and the device is queueing but it is in recovery mode and out of > retries with no usable paths, manually disable queue_if_no_path. > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> Thanks, this makes sense. I am just wondering if this logic should be moved to enter_recovery_mode() / leave_recovery_mode() instead. That pair of functions is also called from update_queue_mode{add,del}_path(), where the same reasoning would apply, afaics. Am I overlooking something? Regards Martin > --- > libmultipath/structs_vec.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c > index 0e8a46e7..3cb23c73 100644 > --- a/libmultipath/structs_vec.c > +++ b/libmultipath/structs_vec.c > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp) > !mpp->in_recovery) > dm_queue_if_no_path(mpp->alias, 1); > leave_recovery_mode(mpp); > - } else if (pathcount(mpp, PATH_PENDING) == 0) > + } else if (pathcount(mpp, PATH_PENDING) == 0) { > + /* > + * If in_recovery is set, > enter_recovery_mode does > + * nothing. If the device is already in > recovery > + * mode and has already timed out, manually > call > + * dm_queue_if_no_path to stop it from > queueing. > + */ > + if ((!mpp->features || is_queueing) && > + mpp->in_recovery && mpp->retry_tick == > 0) > + dm_queue_if_no_path(mpp->alias, 0); > enter_recovery_mode(mpp); > + } > break; > } > }