On 2005-04-21T15:13:16, Patrick Mansfield <patmans@xxxxxxxxxx> wrote:

> > The most recent udm patchset has a patch by Jens Axboe and myself to
> > pass up sense data / error codes in the bio so the dm mpath module can
> > deal with it.
> But the scmd->result is not passed back.

Bear with me and my limited knowledge of the SCSI midlayer for a second:
what additional benefit would this provide over sense key/asc/ascq and the
error parameter in the bio end_io path?

> Better to decode the error once, and then pass that data back to the
> blk layer.

Decoding is device specific. So is the handling of path initialization and
other things. I'd rather have this consolidated in one module than have
parts of it in the mid-layer and other parts in the multipath code.

Could this be handled by a module in the mid-layer which receives commands
from the DM multipath layers above and passes appropriate flags back up?
Probably. (I think this is what you're suggesting.) But frankly, I prefer
the current approach, which works. I don't see a real benefit in your
architecture, besides spreading things out further.

> > Only issue still is that the SCSI midlayer does only generate a single
> > "EIO" code also for timeouts; however, that pretty much means it's a
> > transport error, because if it was a media error, we'd be getting sense
> > data ;-)
> How does lack of sense data imply that there was no media/device error?

It does not always imply that. Note the "pretty much ... ;-)".

The one thing which could be improved here is that I'm not sure whether an
EIO without sense data from the SCSI mid-layer always corresponds to a
timeout. Could we get EIO for other errors as well?

However, as you correctly state later, it's pretty safe to treat such
errors as a "path error" and retry elsewhere, because if it was a false
failure, the path checker will reinstate the path soon enough.

> timeout could be a failure anywhere, in the transport or because of
> target/media/LUN problems. Or not a real error at all, just a busy device
> or too short a timeout setting.

Well, the "not real" errors might still benefit from the IO being retried
on another path, though.

> Does path checker take paths permanently offline after multiple failures?

The path checker lives in user-space, and that's policy ;-) So, from the
kernel perspective, it doesn't matter. User-space currently does not
'permanently' fail paths, but it could be modified to do so if a path goes
up/down at too high a rate, basically dampening for stability. Patches
welcome.

> So though I don't like the approach: distinguishing timeouts or ensuring
> that path checker won't continually reenable a path might be good enough,
> as long as there are no other error cases (driver or SCSI) that could lead
> to long lasting failures.

That's essentially what is being done.

However, there are some more special cases (such as a storage array telling
us that the service processor is no longer active and that we should switch
not to another path on the same SP, but to the other SP; we model this in
dm-mpath via different priority groups and by triggering a PG switch), and
some errors translate into errors being immediately propagated upwards
(media error, illegal request, data protect and some others; again, this
might include specific handling based on the storage being addressed),
because for these, retrying on another path (or switching service
processors) doesn't make any sense or might even be harmful.
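Just to make that concrete (this is only a rough sketch I'm typing into the
mail, not the actual dm-mpath or SCSI code; dm_classify_sense() and the
MP_* action values are invented for the example, and the hw-specific SP
check is hypothetical), the classification boils down to something like:

	/*
	 * Illustration only -- not the real code.  Sense key values are
	 * the standard ones from <scsi/scsi.h>.
	 */
	enum mp_action {
		MP_RETRY_PATH,	/* fail this path, retry the IO elsewhere */
		MP_SWITCH_PG,	/* switch priority groups (the other SP)  */
		MP_FAIL_IO,	/* propagate the error upwards right away */
	};

	static enum mp_action dm_classify_sense(u8 key, u8 asc, u8 ascq)
	{
		switch (key) {
		case MEDIUM_ERROR:	/* 0x03 */
		case ILLEGAL_REQUEST:	/* 0x05 */
		case DATA_PROTECT:	/* 0x07 */
			/* Retrying on another path cannot help here. */
			return MP_FAIL_IO;
		case NOT_READY:		/* 0x02 */
			/*
			 * A hw-specific handler could recognize its
			 * array's "SP not active" asc/ascq here and ask
			 * for a priority group switch instead.
			 */
			return MP_SWITCH_PG;
		default:
			/* Everything else is treated as a path error. */
			return MP_RETRY_PATH;
		}
	}

The point being that all of this knowledge lives in one place (dm-mpath
plus hw-specific modules), instead of being spread between the mid-layer
and dm.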
> Yes, but that doesn't mean we should decode SCSI sense or scsi core
> errors (i.e. scmd->result) in dm space.

This happens in the SCSI layer; dm-mpath only sees the already 'decoded'
sense key/asc/ascq.

> Also, non-scsi drivers would like to use dm multipath, like DASD. Using
> extended blk errors allows simpler support for such devices and drivers.

Sure. The bi_error field introduced by Axboe's patch has flags detailing
what kind of error information is available - it's either ERRNO (basically,
the current "error") or SENSE (for certain SCSI requests, where sense is
available). It could be extended to include a DASD class, and then be
complemented by a dm-dasd module for hw-specific handling of any other
specific needs they might have.

Can you sketch/summarize your suggested design in more detail? That would
be helpful for me, because I missed parts of the earlier discussion.

Sincerely,
    Lars Marowsky-Brée <lmb@xxxxxxx>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html