On Wed, Aug 25 2010 at 4:00am -0400,
Kiyoshi Ueda <k-ueda@xxxxxxxxxxxxx> wrote:

> > I'm not sure how to proceed here. How much work would
> > discerning between transport and IO errors take? If it can't be done
> > quickly enough the retry logic can be kept around to keep the old
> > behavior but that already was a broken behavior, so... :-(
>
> I'm not sure how long will it take.

We first need to understand what direction we want to go with this. We
currently have 2 options. But any other ideas are obviously welcome.

1)
Mike Christie has a patchset that introduces more specific
target/transport/host error codes. Mike shared these pointers but he'd
have to put the work in to refresh them:
http://marc.info/?l=linux-scsi&m=112487427230642&w=2
http://marc.info/?l=linux-scsi&m=112487427306501&w=2
http://marc.info/?l=linux-scsi&m=112487431524436&w=2
http://marc.info/?l=linux-scsi&m=112487431524350&w=2

errno.h new EXYZ
http://marc.info/?l=linux-kernel&m=107715299008231&w=2

add block layer blkdev.h error values
http://marc.info/?l=linux-kernel&m=107961883915068&w=2

add block layer blkdev.h error values (v2 convert more drivers)
http://marc.info/?l=linux-scsi&m=112487427230642&w=2

I think that patchset's approach is fairly disruptive just to be able to
train upper layers (e.g. mpath) to differentiate between error types.
But in the end maybe that change takes the code in a more desirable
direction?

2)
Another option is Hannes' approach of having DM consume req->errors and
SCSI sense more directly.

I've refreshed Hannes' previous patchset against 2.6.36-rc2 but I
haven't finished testing it yet (should be OK.. it boots, but there is
still a FIXME to move scsi_uld_should_retry to scsi_error.c):
http://people.redhat.com/msnitzer/patches/dm-scsi-sense/

Would be great if James, Hannes and others had a look at this refreshed
RFC patchset. It's clearly not polished but it gives an idea of the
approach. Does this look worthwhile?

Follow-on work is needed to refine scsi_uld_should_retry further.
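
To make the transport-vs-target distinction under discussion concrete,
here is a rough userspace sketch. It is not code from either patchset.
The names classify_completion and enum mp_action are hypothetical, and
the retry policy is only my assumption about what a multipath consumer
might want. The host byte and sense key values are, as far as I know,
the ones from the kernel's SCSI headers and SPC, but please check them
against your tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Subset of SCSI host byte codes (believed to match include/scsi/scsi.h). */
enum host_byte_code {
    DID_OK                  = 0x00,
    DID_NO_CONNECT          = 0x01,
    DID_BUS_BUSY            = 0x02,
    DID_TIME_OUT            = 0x03,
    DID_TRANSPORT_DISRUPTED = 0x0e,
    DID_TRANSPORT_FAILFAST  = 0x0f,
};

/* Subset of SPC sense keys. */
enum sense_key_code {
    NO_SENSE        = 0x0,
    NOT_READY       = 0x2,
    MEDIUM_ERROR    = 0x3,
    ILLEGAL_REQUEST = 0x5,
    DATA_PROTECT    = 0x7,
    BLANK_CHECK     = 0x8,
};

/* Hypothetical verdict an upper layer such as mpath could act on. */
enum mp_action {
    MP_COMPLETE_OK,       /* no error */
    MP_RETRY_OTHER_PATH,  /* looks like a path/transport problem */
    MP_FAIL_IO,           /* target-side error; another path will not help */
};

/* The host byte lives in bits 16..23 of the SCSI result word. */
unsigned int result_host_byte(uint32_t result)
{
    return (result >> 16) & 0xff;
}

/*
 * Assumed policy, for illustration only:
 *  - transport-style host bytes   -> retry on another path
 *  - "genuine" target sense keys  -> fail the I/O
 *  - any other non-zero result    -> retry (the old, conservative behavior)
 */
enum mp_action classify_completion(uint32_t result, bool sense_valid,
                                   uint8_t sense_key)
{
    switch (result_host_byte(result)) {
    case DID_NO_CONNECT:
    case DID_BUS_BUSY:
    case DID_TIME_OUT:
    case DID_TRANSPORT_DISRUPTED:
    case DID_TRANSPORT_FAILFAST:
        return MP_RETRY_OTHER_PATH;
    default:
        break;
    }

    if (sense_valid) {
        switch (sense_key) {
        case MEDIUM_ERROR:
        case ILLEGAL_REQUEST:
        case DATA_PROTECT:
        case BLANK_CHECK:
            return MP_FAIL_IO;
        default:
            break;
        }
    }

    return result == 0 ? MP_COMPLETE_OK : MP_RETRY_OTHER_PATH;
}
```

With this policy, a CHECK CONDITION carrying MEDIUM_ERROR sense is
failed straight up to the caller, while a DID_NO_CONNECT completion is
retried on another path. Where exactly that line should be drawn is the
part that needs refining.
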
Keep in mind that scsi_error.c is the intended location for
scsi_uld_should_retry.

James, please note that I've attempted to make REQ_TYPE_FS set
req->errors only for "genuine errors" by (ab)using
scsi_decide_disposition:
http://people.redhat.com/msnitzer/patches/dm-scsi-sense/scsi-Always-pass-error-result-and-sense-on-request-completion.patch

If others think this may be worthwhile I can finish testing, clean up
the patches further, and post them.

Mike