Re: scsi_dh_alua: add missing transitioning state support

Mike Christie <michaelc@xxxxxxxxxxx> · Tue, 21 Sep 2010 16:14:14 -0500

On 09/21/2010 02:33 PM, Mike Snitzer wrote:
On Mon, Sep 20 2010 at 10:27pm -0400,
Mike Christie<michaelc@xxxxxxxxxxx>  wrote:

On 09/20/2010 10:35 AM, Mike Snitzer wrote:
Hi Hannes,

On Tue, Aug 31 2010 at 11:11am -0400,
Mike Snitzer<snitzer@xxxxxxxxxx>   wrote:

On Mon, Aug 30 2010 at  5:36am -0400,
Hannes Reinecke<hare@xxxxxxx>   wrote:

Nicholas A. Bellinger wrote:
On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
Handle transitioning in the prep_fn.
Handle transitioning in alua_rtpg's implicit alua code too.

These gaps were identified during controller failover testing of an
ALUA array.

Signed-off-by: Mike Snitzer<snitzer@xxxxxxxxxx>
---
  drivers/scsi/device_handler/scsi_dh_alua.c |   10 +++++++---
  1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 1a970a7..c1eedc5 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
  		    h->state == TPGS_STATE_STANDBY)
  			/* Useable path if active */
  			err = SCSI_DH_OK;
+		else if (h->state == TPGS_STATE_TRANSITIONING)
+			/* State transition, retry */
+			goto retry;
  		else
  			/* Path unuseable for unavailable/offline */
  			err = SCSI_DH_DEV_OFFLINED;
@@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
  	struct alua_dh_data *h = get_alua_data(sdev);
  	int ret = BLKPREP_OK;

-	if (h->state != TPGS_STATE_OPTIMIZED&&
-	    h->state != TPGS_STATE_NONOPTIMIZED) {
+	if (h->state == TPGS_STATE_TRANSITIONING)
+		ret = BLKPREP_DEFER;
+	else if (h->state != TPGS_STATE_OPTIMIZED&&
+		 h->state != TPGS_STATE_NONOPTIMIZED) {
  		ret = BLKPREP_KILL;
  		req->cmd_flags |= REQ_QUIET;
  	}
  	return ret;
-
  }


Makes sense to me..

Acked-by: Nicholas A. Bellinger<nab@xxxxxxxxxxxxxxx>

Not so fast. There are two problems with this approach:

The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
too.

And what is the problem with that?  The IO will eventually time out.

To restate as a question: even though we'll retry in alua_rtpg();
shouldn't the SCSI command eventually time out (via
scsi_attempt_requeue_command)?

That function is only in RHEL. Requests that are prepd and sent to
the scsi layer and driver would eventually timeout in
scsi_softirq_done in upstream.

alua_prep_fn prevents the IO from getting sent to the scsi layer so
we do not hit the check in scsi_softirq_done though.

That is only the case if alua_prep_fn were to return BLKPREP_DEFER
right?

Yeah. I misread the email above. Just to clarify..

For the alua case, we will always retry due the prep_fn issue I mentioned.

For the alua_rtpg() retry case you were discussing we will also always 
retry, because the timer check in 
scsi_attempt_requeue_command/scsi_softirq_done will break us from 
retrying forever in that execution of the request started by the 
submit_rtpg call. However, the added goto retry will would just end up 
starting a another execution. To handle Hannes comment I think you would 
want to add some retry/timer checks in alua_rtpg to prevent that from 
retrying forever.


2) the patch also modified alua_rtpg() so implicit ALUA would retry
    (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
    - so why should we avoid retry for implicit but do it for explicit?

Leaving that for Hannes. I cannot think of a reason. Probably just did 
not do it.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html