On 2020-09-30 01:02, Hannes Reinecke wrote:
> When the ALUA state indicates transitioning we should not retry
> the command immediately, but rather complete the command with
> BLK_STS_AGAIN to signal the completion handler that it might
> be retried.
> This allows multipathing to redirect the command to another path
> if possible, and avoid stalls during lengthy transitioning times.
>
> Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 2 +-
>  drivers/scsi/scsi_lib.c                    | 5 +++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 308bda2e9c00..a68222e324e9 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -1092,7 +1092,7 @@ static blk_status_t alua_prep_fn(struct scsi_device *sdev, struct request *req)
>  	case SCSI_ACCESS_STATE_LBA:
>  		return BLK_STS_OK;
>  	case SCSI_ACCESS_STATE_TRANSITIONING:
> -		return BLK_STS_RESOURCE;
> +		return BLK_STS_AGAIN;
>  	default:
>  		req->rq_flags |= RQF_QUIET;
>  		return BLK_STS_IOERR;
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f0ee11dc07e4..b628aa0d824c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1726,6 +1726,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		    scsi_device_blocked(sdev))
>  			ret = BLK_STS_DEV_RESOURCE;
>  		break;
> +	case BLK_STS_AGAIN:
> +		scsi_req(req)->result = DID_BUS_BUSY << 16;
> +		if (req->rq_flags & RQF_DONTPREP)
> +			scsi_mq_uninit_cmd(cmd);
> +		break;
>  	default:
>  		if (unlikely(!scsi_device_online(sdev)))
>  			scsi_req(req)->result = DID_NO_CONNECT << 16;

Hi Hannes,

What will happen if all remote ports have the state "transitioning"?
Does the above code resubmit a request immediately in that case? Can
this cause spinning with 100% CPU usage if the ALUA device handler
notices the transitioning state before multipathd does?
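
The scenario in question can be modelled in user space. The following is only a sketch, not kernel code: every name in it (path_state, prep, submit_immediate_retry) is hypothetical, and it assumes that a request completed with the "again" status is resubmitted on the next path without any delay. Whether the real multipath layer behaves that way is exactly the open question.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum path_state { PATH_ACTIVE, PATH_TRANSITIONING };
enum status { STS_OK, STS_AGAIN };

/* Models the patched prep decision for the two states considered here. */
static enum status prep(enum path_state s)
{
	return s == PATH_TRANSITIONING ? STS_AGAIN : STS_OK;
}

/*
 * Assumption: a request that gets STS_AGAIN is retried on the next path
 * right away, with no back-off. Returns the number of attempts made,
 * capped at max_attempts so that the model always terminates.
 */
static unsigned int submit_immediate_retry(const enum path_state *paths,
					   unsigned int npaths,
					   unsigned int max_attempts)
{
	unsigned int attempts = 0;

	assert(npaths > 0);

	while (attempts < max_attempts) {
		enum status st = prep(paths[attempts % npaths]);

		attempts++;
		if (st == STS_OK)
			break;
	}
	return attempts;
}
```

With one transitioning path and one active path the model finishes on the second attempt. With every path transitioning it only stops at the cap, which is the busy loop being asked about.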

Thanks,

Bart.