Luben Tuikov wrote:
> --- On Fri, 2/1/08, Tony Battersby <tonyb@xxxxxxxxxxxxxxx> wrote:
>
>> Also, I disagree about treating recovered error like
>> hardware/medium error. Recovered error is supposed to mean
>> "the last command completed successfully, with some recovery
>> action performed by the device server".
>
> Which then means that you agree with
> commit 03aba2f7.

I disagree only with this part of the commit:

-	good_bytes = (error_sector - SCpnt->request->sector) << 9;
-	if (good_bytes < 0 || good_bytes >= this_count)
-		good_bytes = 0;

So it removed the sanity-check on good_bytes, which broke error handling
for my out-of-spec RAID. My patch adds the check back, only doing it
before the multiplication by the sector size rather than after. That is
also why I wanted to add an upper-bound check, to make sure that sd_done
never returned good_bytes > xfer_size, but no one else agreed with that
level of paranoia.

> But the definition of RECOVERED ERROR immediately
> after what you quoted, adds:
> "Details may be determined by examining the
> additional sense bytes and the INFORMATION field."

I guess the question is: if a disk drive returns RECOVERED ERROR with
info_valid=1 and the sector number in the sense bytes, does that mean
that the disk completed the command successfully and transferred all the
data (and is reporting the sector number for information logging
purposes only), or does it mean that it stopped reading or writing at
the sector indicated in the sense data? I can't really say for sure, so
I will leave the debate to others.

BTW, your patch will result in sd_done returning good_bytes == 0 for the
case where sense_key == RECOVERED ERROR && info_valid == 0, which I
think is probably wrong. In this case I would return good_bytes == 0
for hardware/medium error and good_bytes == xfer_size for recovered
error.

> Thus the patch I sent to you for you to try on
> your hardware.
>

My hardware isn't returning "recovered error" or "no sense" sense keys;
I was just trying to improve the handling of these cases while I was
looking at the function. Thus, there is no point for me to test your
full patch. My problem is now solved with the simplified patch I already
posted. If you want to push for the RECOVERED ERROR change, then go
right ahead with your own patch, but I'm done.

Tony
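
For reference, the good_bytes policy argued for in this message can be
sketched as a standalone user-space program fragment. This is only an
illustration under stated assumptions, not the sd_done code and not
either of the patches discussed: the function names, the parameter list,
and the SK_* constants are hypothetical (the constant values follow the
SCSI sense key codes). The RECOVERED ERROR with info_valid == 1 case is
deliberately left out, since that question is still open.

```c
#include <stdint.h>

/* Illustrative names; the values are the SCSI sense key codes. */
enum {
	SK_RECOVERED_ERROR = 0x1,
	SK_MEDIUM_ERROR    = 0x3,
	SK_HARDWARE_ERROR  = 0x4
};

/*
 * Case info_valid == 1 (medium/hardware error): derive good_bytes from
 * the error sector reported in the sense data. The range check is done
 * in sectors, before the multiplication by the sector size, so a bogus
 * error_sector can neither go negative nor exceed the transfer.
 */
unsigned int checked_good_bytes(uint64_t start_sector, uint64_t error_sector,
				unsigned int xfer_size, unsigned int sector_size)
{
	uint64_t good_sectors;

	if (sector_size == 0)
		return 0;
	/* Error reported before the start of the request: distrust it. */
	if (error_sector < start_sector)
		return 0;
	good_sectors = error_sector - start_sector;
	/* Error reported at or past the end of the request: distrust it. */
	if (good_sectors >= xfer_size / sector_size)
		return 0;
	/* Here good_sectors * sector_size < xfer_size, so no overflow. */
	return (unsigned int)(good_sectors * sector_size);
}

/*
 * Case info_valid == 0: no sector number is available. A recovered
 * error means the command completed, so report the whole transfer;
 * for a medium or hardware error report nothing as good.
 */
unsigned int good_bytes_without_info(int sense_key, unsigned int xfer_size)
{
	return sense_key == SK_RECOVERED_ERROR ? xfer_size : 0;
}
```

With a 4096-byte request starting at sector 100 and 512-byte sectors, an
error reported at sector 104 gives 2048 good bytes, while an error
reported at sector 99 or at sector 108 is rejected and gives 0, which
mirrors the effect of the removed "good_bytes < 0 || good_bytes >=
this_count" test.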