Re: [PATCH 9/9] dm path selector: Avoid that device removal triggers an infinite loop

Bart Van Assche <bart.vanassche@xxxxxxxxxxx> · Thu, 1 Sep 2016 08:22:30 -0700

On 09/01/2016 08:06 AM, Mike Snitzer wrote:
> On Thu, Sep 01 2016 at 10:14am -0400,
> Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx> wrote:
>
>> On 08/31/16 20:29, Mike Snitzer wrote:
>>> On Wed, Aug 31 2016 at  6:18pm -0400,
>>> Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
>>>
>>>> If pg_init_retries is set and a request is queued against a
>>>> multipath device with all underlying block devices in the "dying"
>>>> state then an infinite loop is triggered because activate_path()
>>>> never succeeds and hence never calls pg_init_done(). Fix this by
>>>> making ql_select_path() skip dying paths.
>>>
>>> Assuming DM multipath needs to be sprinkling these dying queue checks so
>>> deep (which I'm not yet sold on):
>>>
>>> Same would be needed in service-time and round-robin right?
>>
>> Hello Mike,
>>
>> Before addressing service-time and round-robin path selectors I wanted
>> to make sure that we reach agreement about how to fix the queue length
>> path selector.
>>
>> Do you have a proposal for an alternative approach to fix the infinite
>> loop that can be triggered during device removal?
>
> I'm going to look closer now.  But I'd prefer to see the "dying" state
> check(s) elevated to DM multipath.  Really would rather the path
> selectors not have to worry about this state.

Hello Mike,

How about making blk_cleanup_queue() invoke a callback function in dm or
dm-mpath, and using that callback to keep track of the number of paths
that are not in the "dying" state? That would allow the dm or dm-mpath
driver to detect whether all paths are in the dying state without having
to modify every path selector. This is just an idea - there might be
better alternatives.
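
To make the idea more concrete, here is a rough user-space model of the
bookkeeping, not kernel code. Every name in it (struct mpath, struct
path, mpath_add_path(), mpath_path_dying(), model_cleanup_queue(),
mpath_all_paths_dying()) is made up for illustration, and
model_cleanup_queue() only stands in for the point where
blk_cleanup_queue() would invoke the proposed callback. Locking is left
out entirely; a real implementation would need it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct path;

/* Hypothetical hook: called once when a path's queue starts dying. */
typedef void (*path_dying_fn)(struct path *p);

/* Hypothetical stand-in for the multipath device. */
struct mpath {
	int nr_live_paths;	/* paths whose queue is not dying */
};

/* Hypothetical stand-in for one underlying path. */
struct path {
	struct mpath *owner;
	bool dying;
	path_dying_fn on_dying;	/* registered by the multipath layer */
};

/* Callback the multipath layer registers for each of its paths. */
void mpath_path_dying(struct path *p)
{
	assert(p->owner->nr_live_paths > 0);
	p->owner->nr_live_paths--;
}

/* Add a live path and register the callback for it. */
void mpath_add_path(struct mpath *m, struct path *p)
{
	p->owner = m;
	p->dying = false;
	p->on_dying = mpath_path_dying;
	m->nr_live_paths++;
}

/* Models the block layer marking a queue as dying during removal. */
void model_cleanup_queue(struct path *p)
{
	if (p->dying)
		return;		/* notify only on the first transition */
	p->dying = true;
	if (p->on_dying != NULL)
		p->on_dying(p);
}

/* One check in the multipath layer, independent of the path selector. */
bool mpath_all_paths_dying(const struct mpath *m)
{
	return m->nr_live_paths == 0;
}
```

With something along these lines, the multipath code could test
mpath_all_paths_dying() before attempting path activation and bail out
instead of retrying forever, and none of the path selectors would need
to know about the "dying" state.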

Bart.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel