I would like to revisit this patch, since the problem it addresses continues to cause fallout for us. Our best practice has always been to set no_path_retry to 0 in multipath-tools. This means that customers with a previously working configuration file upgrade into this problem.

My understanding is that this patch was not accepted the first time because some array vendors stay in the ALUA transitioning state for a very long time. It doesn't seem to me that not failing the paths leads to a problem, since the path checker and path priority will protect against continually using the transitioning paths, but I am not aware of the array vendor that led to this patch in the first place. If this patch is still not acceptable, can it be made acceptable with a flag allowing this behavior?

Without this patch we have to reach out to all of our customers who are at risk and let them know that changing no_path_retry to some non-zero value is required before they upgrade. There is no good way to reach them all before they hit this issue and take an unexpected outage.

Changing no_path_retry is not a perfect fit for us either. There are situations where we expect to reach all paths down and have the error bubble up as soon as possible. Today there is no distinction between getting there through the transitioning state and getting there through some other state such as unavailable or standby. The fail path logic is the same. If the answer is that multipath-tools should handle this, then path failure needs to distinguish these cases, so that multipath-tools can queue on transitioning but fail on other states. That would retain the previous behavior without either regression mentioned above.
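To make the requested distinction concrete, here is a minimal user-space sketch of the decision the patch below makes once an error has already been classified as a path error. It is an illustration only, not kernel code: the `sketch_*` and `SK_*` names are hypothetical stand-ins for the kernel's status and DM_ENDIO_* values, and the sketch assumes BLK_STS_AGAIN is the status seen for the transitioning case discussed above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for blk_status_t values; not the kernel definitions. */
enum sketch_status {
	SK_STS_RESOURCE,         /* models BLK_STS_RESOURCE */
	SK_STS_AGAIN,            /* models BLK_STS_AGAIN (assumed: transitioning) */
	SK_STS_OTHER_PATH_ERROR  /* any other error treated as a path error */
};

/* Hypothetical stand-ins for DM_ENDIO_REQUEUE and DM_ENDIO_DELAY_REQUEUE. */
enum sketch_action {
	SK_REQUEUE,
	SK_DELAY_REQUEUE
};

struct sketch_decision {
	enum sketch_action action; /* how the request is requeued */
	bool fail_path;            /* whether the path would be failed */
};

/*
 * Models the patched branch for an error already known to be a path error:
 * delay the requeue for RESOURCE and AGAIN, and fail the path for every
 * status except AGAIN.
 */
static struct sketch_decision sketch_end_io(enum sketch_status error,
					    bool have_pgpath)
{
	struct sketch_decision d;

	if (error == SK_STS_RESOURCE || error == SK_STS_AGAIN)
		d.action = SK_DELAY_REQUEUE;
	else
		d.action = SK_REQUEUE;

	d.fail_path = have_pgpath && error != SK_STS_AGAIN;
	return d;
}
```

With this model, the transitioning case is requeued with a delay and the path stays valid, while every other path error still fails the path, so all paths down is reached as quickly as before.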

Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx>
Acked-by: Krishna Kant <krishna.kant@xxxxxxxxxxxxxxx>
Acked-by: Seamus Connor <sconnor@xxxxxxxxxxxxxxx>
--
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index bced42f082b0..28948cc481f9 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1652,12 +1652,12 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
 	if (error && blk_path_error(error)) {
 		struct multipath *m = ti->private;
 
-		if (error == BLK_STS_RESOURCE)
+		if (error == BLK_STS_RESOURCE || error == BLK_STS_AGAIN)
 			r = DM_ENDIO_DELAY_REQUEUE;
 		else
 			r = DM_ENDIO_REQUEUE;
 
-		if (pgpath)
+		if (pgpath && (error != BLK_STS_AGAIN))
 			fail_path(pgpath);
 
 		if (!atomic_read(&m->nr_valid_paths) &&
--
Brian Bunker
PURE Storage, Inc.
brian@xxxxxxxxxxxxxxx