Re: [PATCH] sd: always retry READ CAPACITY for ALUA state transition

On Thu, 2015-04-30 at 14:26 +0200, Hannes Reinecke wrote:
> On 04/28/2015 11:18 PM, James Bottomley wrote:
> > On Mon, 2015-04-27 at 11:35 +0200, Hannes Reinecke wrote:
> >> During ALUA state transitions the device might return
> >> a sense code 02/04/0a (Logical unit not accessible, asymmetric
> >> access state transition). As this is a transient error
> >> we should just retry the READ CAPACITY call until
> >> the state transition finishes and the correct
> >> capacity can be returned.
> >>
> >> Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
> >> ---
> >>  drivers/scsi/sd.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> >> index 79beebf..7178b05 100644
> >> --- a/drivers/scsi/sd.c
> >> +++ b/drivers/scsi/sd.c
> >> @@ -1987,6 +1987,11 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >>  				 * give it one more chance */
> >>  				if (--reset_retries > 0)
> >>  					continue;
> >> +			if (sense_valid &&
> >> +			    sshdr.sense_key == NOT_READY &&
> >> +			    sshdr.asc == 0x04 && sshdr.ascq == 0x0A)
> >> +				/* ALUA state transition; always retry */
> >> +				continue;
> >>  		}
> >>  		retries--;
> >>  
> >> @@ -2069,6 +2074,11 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >>  				 * give it one more chance */
> >>  				if (--reset_retries > 0)
> >>  					continue;
> >> +			if (sense_valid &&
> >> +			    sshdr.sense_key == NOT_READY &&
> >> +			    sshdr.asc == 0x04 && sshdr.ascq == 0x0A)
> >> +				/* ALUA state transition; always retry */
> >> +				continue;
> >>  		}
> >>  		retries--;
> >>  
> > 
> > Got to say I really don't like this infinite retry possibility.  How
> > long does the ALUA transition take?  Would increasing retries work (or
> > even hijacking reset_retries)?
> > 
> Well ... transitioning could be quite long (NetApp FAS has a
> transition timeout of 30 _minutes_ ...).
> But yeah, I could see to limit this somewhat.

I think that might be a good idea.  We can't hold this device (and the
corresponding asynchronous probe thread) in a continuous loop for 30
minutes ...

James
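
One way to limit the retries, as suggested above, is to give the ALUA
transition its own finite budget, much as the existing code does with
reset_retries. What follows is only a minimal userspace sketch of that
idea and not the patch that was eventually applied. The names
read_capacity_bounded(), issue_fn, struct sense_hdr and struct fake_dev
are hypothetical stand-ins for the sd.c internals, and the budget values
are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sense values taken from the quoted patch: 02/04/0a. */
#define SENSE_KEY_NOT_READY	0x02
#define ASC_LUN_NOT_ACCESSIBLE	0x04
#define ASCQ_ALUA_TRANSITION	0x0a

/* Hypothetical, simplified stand-in for struct scsi_sense_hdr. */
struct sense_hdr {
	uint8_t sense_key;
	uint8_t asc;
	uint8_t ascq;
};

static bool is_alua_transition(const struct sense_hdr *s)
{
	return s->sense_key == SENSE_KEY_NOT_READY &&
	       s->asc == ASC_LUN_NOT_ACCESSIBLE &&
	       s->ascq == ASCQ_ALUA_TRANSITION;
}

/*
 * Hypothetical callback standing in for issuing READ CAPACITY.
 * Returns 0 on success; on failure returns non-zero and fills *sense.
 */
typedef int (*issue_fn)(void *ctx, struct sense_hdr *sense);

/*
 * Retry loop with two separate budgets: 'retries' for ordinary
 * failures and 'alua_retries' for the transitioning state. Once the
 * ALUA budget is used up, further transition errors consume ordinary
 * retries, so the loop always terminates.
 * Returns 0 on success, -1 when both budgets are exhausted.
 */
static int read_capacity_bounded(issue_fn issue, void *ctx,
				 int retries, int alua_retries)
{
	struct sense_hdr sense;

	assert(retries > 0 && alua_retries >= 0);

	while (retries > 0) {
		sense = (struct sense_hdr){ 0 };
		if (issue(ctx, &sense) == 0)
			return 0;
		if (is_alua_transition(&sense) && alua_retries > 0) {
			/* Transient state: retry without using 'retries'. */
			alua_retries--;
			continue;
		}
		retries--;
	}
	return -1;
}

/* Fake device: reports a transition N times, then succeeds. */
struct fake_dev {
	int transitions_left;
	int calls;
};

static int fake_issue(void *ctx, struct sense_hdr *sense)
{
	struct fake_dev *d = ctx;

	d->calls++;
	if (d->transitions_left > 0) {
		d->transitions_left--;
		sense->sense_key = SENSE_KEY_NOT_READY;
		sense->asc = ASC_LUN_NOT_ACCESSIBLE;
		sense->ascq = ASCQ_ALUA_TRANSITION;
		return 1;
	}
	return 0;
}
```

A device that leaves the transitioning state within the budget still
gets its capacity read, while one that stays there fails after a fixed
number of attempts instead of pinning the probe thread. A real
implementation would presumably also need a delay between attempts and
a budget sized for the target, neither of which is modelled here.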





