On Mon, Jan 17 2011 at 10:52am -0500,
Hannes Reinecke <hare@xxxxxxx> wrote:

> On 01/14/2011 06:16 PM, Mike Snitzer wrote:
> > On Fri, Jan 14 2011 at 11:10am -0500,
> > Jonathan McDowell <noodles@xxxxxxxx> wrote:
> >>
> >> I'd have viewed a reservation conflict as being tied to a particular
> >> path, rather than the entire target. I've seen multipath setups where
> >> there are reservation issues on some of the paths but others are fine
> >> and this is expected (eg use of reservations to fence off particular
> >> paths).
> >
> > Very good point (as I think you're correct).  Technically a reservation
> > conflict is retryable across _different_ paths but (relative to the
> > error path as it relates to multipath) it appears Hannes elected to go
> > with the conservative approach of always failing the IO upward given the
> > potential for data corruption when queue_if_no_path is used.
> >
> > Hannes previously touched on this here:
> > https://www.redhat.com/archives/dm-devel/2009-November/msg00190.html
> >
> > "This also solves a potential data corruption with multipathing
> > and persistent reservations. When queue_if_no_path is active
> > multipath will queue any I/O failure (including those failed
> > with RESERVATION CONFLICT) until the reservation status changes.
> > But by then I/O might have been ongoing on the other paths,
> > thus the delayed submission will severely corrupt your data."
> >
> > Even in the context of that older SCSI sense-based mpath patchset a
> > reservation conflict would always fail upward (regardless of path count
> > and/or queue_if_no_path).
> >
> > All said, the above doesn't excuse what seems to be a mis-categorization
> > of reservation conflict as a pure non-retryable TARGET_FAILURE
> > (EREMOTEIO).
> >
> Ho-hum.
>
> Yes, and no.
>
> Yes, it is correct that persistent reservations are in fact per
> ITL nexus, and hence might yield different responses if retried on
> another path.
>
> And no, it is not entirely correct to return the standard EIO error
> here as then the no_path_retry mechanism might kick in and we're
> back to square one.
>
> That said we probably need to invent another error code with
> meaning 'Retry on other ITL nexus if present, but ignore no_path_retry'.

That sounds right.  So something like the following?

- set ITL_NEXUS_ERROR/DID_ITL_NEXUS_FAILURE in scsi (comparable to how you
  did TARGET_ERROR/DID_TARGET_FAILURE)
- then return -EAGAIN from __scsi_error_from_host_byte() to signal to
  upper layer(s) that a retry could be worthwhile; how to act on that
  is driver specific

In mpath's case it could respond to -EAGAIN by conservatively retrying
with the 'Retry on other ITL nexus if present, but ignore no_path_retry'
semantic.

(Overloading -EAGAIN leaves something to be desired, but I welcome other
ideas.)

Thanks,
Mike
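
To make the proposal above concrete, here is a rough user-space sketch of
the two steps: mapping a new host byte to -EAGAIN, and mpath reacting to
it.  This is not kernel code.  The enum values, error_from_host_byte()
and mpath_end_io() are hypothetical stand-ins for DID_*,
__scsi_error_from_host_byte() and the multipath end_io path; only the
intended semantics are modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical host-byte values; NOT the kernel's DID_* definitions. */
enum host_byte {
	HB_OK,
	HB_TARGET_FAILURE,	/* stands in for DID_TARGET_FAILURE */
	HB_ITL_NEXUS_FAILURE,	/* stands in for the proposed DID_ITL_NEXUS_FAILURE */
	HB_OTHER_ERROR		/* any other failure, treated as a path error */
};

/* Models the proposed mapping in __scsi_error_from_host_byte(). */
int error_from_host_byte(enum host_byte hb)
{
	switch (hb) {
	case HB_OK:
		return 0;
	case HB_TARGET_FAILURE:
		return -EREMOTEIO;	/* not retryable on any path */
	case HB_ITL_NEXUS_FAILURE:
		return -EAGAIN;		/* may succeed on another ITL nexus */
	default:
		return -EIO;
	}
}

/* Hypothetical outcomes of multipath's completion handling. */
enum mpath_action {
	MPATH_COMPLETE,		/* I/O succeeded */
	MPATH_FAIL_UP,		/* fail the I/O to the upper layer */
	MPATH_RETRY_OTHER_PATH,	/* resubmit on a different path */
	MPATH_QUEUE		/* hold the I/O until a path returns */
};

/*
 * Models how mpath could react.  other_paths is the number of usable
 * paths besides the one that just failed.
 */
enum mpath_action mpath_end_io(int error, int other_paths,
			       bool queue_if_no_path)
{
	if (!error)
		return MPATH_COMPLETE;

	if (error == -EREMOTEIO)
		return MPATH_FAIL_UP;

	if (error == -EAGAIN) {
		/* Retry elsewhere if possible, but never queue. */
		return other_paths > 0 ? MPATH_RETRY_OTHER_PATH
				       : MPATH_FAIL_UP;
	}

	/* Ordinary path failure: the usual no_path_retry handling applies. */
	if (other_paths > 0)
		return MPATH_RETRY_OTHER_PATH;
	return queue_if_no_path ? MPATH_QUEUE : MPATH_FAIL_UP;
}
```

The only difference between the -EAGAIN branch and the -EIO branch is
that the former never queues, which is what avoids the delayed
submission Hannes described.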