James Bottomley wrote:
> On Thu, 2010-04-01 at 15:44 +0200, Hannes Reinecke wrote:
>> Hazard testing uncovered yet another bug in sd. Under heavy
>> reset activity the retry counter might be exhausted and
>> the command will be returned with sense UNIT_ATTENTION/0x29/00
>> (POWER ON, RESET, OR BUS DEVICE RESET OCCURRED). In those
>> cases we should just increase the retry counter again,
>> retrying one more to clear up this Unit Attention state.
>>
>> Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 1962bea..7d75a21 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -1454,8 +1454,15 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
>>  		if (media_not_present(sdkp, &sshdr))
>>  			return -ENODEV;
>>
>> -		if (the_result)
>> +		if (the_result) {
>>  			sense_valid = scsi_sense_valid(&sshdr);
>> +			if (sense_valid &&
>> +			    sshdr.sense_key == UNIT_ATTENTION &&
>> +			    sshdr.asc = 0x29 && sshdr.asq == 0x00)
>                                    ^^^^
> should be ==
>
>> +				/* Device reset might occur several times,
>> +				 * give it one more chance */
>> +				retries++;
>> +		}
>
> Firstly, not even compile checked:
>
> drivers/scsi/sd.c: In function ‘read_capacity_10’:
> drivers/scsi/sd.c:1558: error: ‘struct scsi_sense_hdr’ has no member named ‘asq’
>
D'oh.

> Secondly, we can't quite do this.  Some devices (only broken ones in my
> experience) will reply UNIT_ATTENTION I was RESET forever, leading to a
> loop here.  Additionally, a massive reset storm on a shared bus would
> DoS the code here, so there must be a give up point after a reasonable
> number of retries.
>
Hmm. Yes.

> The third problem is that if this is happening to a large device, we
> only catch it in RC10 ... so we'll report undersize if the device is
> > SPC2
>
Okay.

In the best of all worlds we would have a module parameter which would
allow us to adjust this limit, as I fear the actual number of retries
needed will depend on the number of devices connected.
But if you feel that's overkill, it's fine by me, too.

> How about this instead?
>
Yes, that's better. Thanks.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
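
For anyone reading this thread later: the "give up point" requested in the review can be illustrated with a small userspace sketch. This is not the patch referred to in the thread, which is not reproduced here. It only assumes one plausible shape: power-on/reset Unit Attentions (0x29/0x00) get their own bounded budget instead of bumping the normal retry counter without limit. The names `fake_dev`, `fake_read_capacity`, `is_reset_ua` and `read_capacity_bounded`, and the cut-down `sense_hdr` struct, are all hypothetical; only the sense key value and the `ascq` field name mirror the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI sense key value for UNIT ATTENTION. */
#define UNIT_ATTENTION 0x06

/* Cut-down stand-in for the kernel's struct scsi_sense_hdr.
 * Note the qualifier field is spelled "ascq", not "asq". */
struct sense_hdr {
	uint8_t sense_key;
	uint8_t asc;	/* additional sense code */
	uint8_t ascq;	/* additional sense code qualifier */
};

/* Hypothetical simulated device: answers the first pending_resets
 * commands with UNIT_ATTENTION/0x29/0x00, then succeeds. */
struct fake_dev {
	int pending_resets;
	int commands_seen;
};

static int fake_read_capacity(struct fake_dev *dev, struct sense_hdr *sshdr)
{
	dev->commands_seen++;
	if (dev->pending_resets > 0) {
		dev->pending_resets--;
		sshdr->sense_key = UNIT_ATTENTION;
		sshdr->asc = 0x29;
		sshdr->ascq = 0x00;
		return -1;
	}
	return 0;
}

/* True for "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED". */
static bool is_reset_ua(const struct sense_hdr *sshdr)
{
	return sshdr->sense_key == UNIT_ATTENTION &&
	       sshdr->asc == 0x29 && sshdr->ascq == 0x00;
}

/* Retry with two budgets: reset Unit Attentions consume reset_retries
 * first and leave the normal budget untouched; once that is used up
 * they are charged like any other failure, so a device that reports
 * "I was reset" forever cannot keep us looping.
 * Returns 0 on success, -1 when giving up. */
static int read_capacity_bounded(struct fake_dev *dev, int retries,
				 int reset_retries)
{
	struct sense_hdr sshdr;

	assert(dev != NULL);
	do {
		sshdr = (struct sense_hdr){0};
		if (fake_read_capacity(dev, &sshdr) == 0)
			return 0;
		if (is_reset_ua(&sshdr) && reset_retries > 0) {
			reset_retries--;
			continue;	/* jumps to the loop condition */
		}
		retries--;
	} while (retries > 0);

	return -1;
}
```

With `retries = 3` and `reset_retries = 10`, a device that was reset five times is read successfully on the sixth command, while a device that reports a reset forever is abandoned after thirteen commands. Whether the reset budget is a constant or a module parameter is exactly the open question above.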