James Bottomley <James.Bottomley@xxxxxxxxxxxx> wrote: > > On Wed, 2006-04-26 at 16:27 -0400, Mark Lord wrote: > > From: Mark Lord <lkml@xxxxxx> > > > > I am looking into how SCSI/SATA handle medium (disk) errors, > > and the observed behaviour is a little more random than expected, > > due to a bug in sd.c. > > > > When scsi_get_sense_info_fld() fails (returns 0), it does NOT update the > > value of first_err_block. But sd_rw_intr() merrily continues to use that > > variable regardless, possibly making incorrect decisions about retries and the like. > > > > This patch removes the randomness there, by using the first sector of the > > request (SCpnt->request->sector) in such cases, instead of first_err_block. > > > > The patch shows more context than usual, to help see what's going on. > > Thanks for finding the bug. Your solution is a bit, um, convoluted. > What it should really be doing if we find no valid information field is > a break so we go out with the default good_sectors of zero (rather than > arriving at that value via a circuitous route). > > And, of course, I couldn't resist eliminating the superfluous info_valid > variable and tidying the logic to be programmatic instead of a switch > case. How does this work? It'd be nice to have something simple-and-obvious for the simple-and-obvious -stable maintainers. That's if we think -stable needs this fixed. > + int sector_size_div = > + 512 / SCpnt->device->sector_size; > + error_sector /= sector_size_div; You sure about this bit? - : send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html