On Wed, 2009-04-22 at 10:41 -0700, Grant Grundler wrote:
> On Mon, Apr 20, 2009 at 11:05 AM, Moger, Babu <Babu.Moger@xxxxxxx> wrote:
> > This patch introduces a mechanism to recover from I/O failures by re-initializing the path if the device is running on only one path.
> >
> > Problem: Device mapper fails the path for every I/O error.
> > It does not care about the type of error.
>
> This is the fundamental problem. Different layers of the block IO
> path have to agree on how to handle each possible type of error that
> can be returned. I don't know where to find such an agreement and
> think an implementation that does discriminate is needed.
>
> > There are certain errors which can be recovered from by re-initializing the path. I have seen this problem during my testing on the rdac device handler. I have observed I/O errors when there is a change in LUN ownership. When LUN ownership changes, the device returns a check condition with sense 0x05/0x94/0x01 (SK/ASC/ASCQ, meaning LUN ownership changed). Currently, device mapper fails the path for this error, and eventually this leads to an I/O error. We don't want to see an I/O error for this reason.
>
> 1) This patch isn't discriminating between transport, media, or other
> device errors. Wouldn't it make sense to discriminate?

Yes, it would. But currently we do not have that.

> "LUN ownership changed" sounds like one of the events that a
> multi-initiator environment would want to be notified about and
> perhaps even take some action on (renegotiate access to
>
> 2) Will this result in resetting a SATA device?
> I ask because device reset may result in data loss due to WCE enabled.
> I just don't know the higher parts of the block SW stack and how
> errors flow up the stack.

The device is not hung; the I/O will come back after a while. BTW, activate doesn't do a reset, it just sends a command (in the LSI rdac case, it just sends a mode select) to the controller.
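To make the "discriminate by error type" idea concrete, here is a minimal userspace sketch, not kernel code. The names `sense_triple`, `path_action` and `classify_sense` are hypothetical and do not exist in dm-mpath or the SCSI layer; the only value taken from this thread is the 0x05/0x94/0x01 triple reported for rdac.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of inspecting a failed I/O; not a kernel type. */
enum path_action {
    PATH_FAIL,   /* fail the path, which is what happens today for any error */
    PATH_REINIT  /* re-initialize the path and requeue the I/O */
};

/* Hypothetical container for the three sense fields quoted in the thread. */
struct sense_triple {
    uint8_t sk;   /* sense key */
    uint8_t asc;  /* additional sense code */
    uint8_t ascq; /* additional sense code qualifier */
};

/*
 * Decide what to do with a path based on sense data.
 * Only the rdac "LUN ownership changed" triple is treated as
 * recoverable here; every other error still fails the path.
 */
static enum path_action classify_sense(const struct sense_triple *s)
{
    assert(s != NULL);

    if (s->sk == 0x05 && s->asc == 0x94 && s->ascq == 0x01)
        return PATH_REINIT;

    return PATH_FAIL;
}
```

A real implementation would presumably keep such a table in the hardware handler, since the meaning of vendor-specific ASC/ASCQ values differs between arrays; that placement is an assumption, not something the patch does.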
>
> thanks,
> grant
>
> >
> > The patch will set the flag pg_init_required if the device is running on single path. The process_queued_ios will re-initialize path if required. I have tested this patch on LSI rdac handler.
> >
> > Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> > ---
> >
> > --- linux-2.6.30-rc2/drivers/md/dm-mpath.c.orig 2009-04-17 16:49:33.000000000 -0500
> > +++ linux-2.6.30-rc2/drivers/md/dm-mpath.c 2009-04-17 17:09:51.000000000 -0500
> > @@ -1152,6 +1152,15 @@ static int do_end_io(struct multipath *m
> >                  return error;
> >
> >          spin_lock_irqsave(&m->lock, flags);
> > +        /*
> > +         * If this is the only path left, then lets try to
> > +         * re-initialize the PG one last time..
> > +         */
> > +        if (m->nr_valid_paths == 1 && m->hw_handler_name) {
> > +                m->pg_init_required = 1;
> > +                spin_unlock_irqrestore(&m->lock, flags);
> > +                goto requeue;
> > +        }
> >          if (!m->nr_valid_paths) {
> >                  if (__must_push_back(m)) {
> >                          spin_unlock_irqrestore(&m->lock, flags);
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
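The decision the hunk adds to do_end_io() can be modeled in userspace roughly as follows. This is a sketch under stated assumptions: `mpath_model`, `end_io_action` and `on_io_error` are hypothetical stand-ins, only the three fields the patch reads or writes are modeled, and the locking around them is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the fields of struct multipath used by the hunk. */
struct mpath_model {
    unsigned nr_valid_paths;
    const char *hw_handler_name; /* NULL when no hardware handler is set */
    bool pg_init_required;
};

/* Hypothetical labels for the branches visible in the hunk. */
enum end_io_action {
    ACTION_REQUEUE_REINIT, /* new branch: flag pg_init and requeue */
    ACTION_NO_PATHS,       /* existing branch: no valid paths remain */
    ACTION_CONTINUE        /* rest of do_end_io(), not modeled here */
};

/* Mirrors the order of checks in the patched code, without the spinlock. */
static enum end_io_action on_io_error(struct mpath_model *m)
{
    assert(m != NULL);

    /* Last remaining path with a hardware handler: ask for re-init. */
    if (m->nr_valid_paths == 1 && m->hw_handler_name) {
        m->pg_init_required = true;
        return ACTION_REQUEUE_REINIT;
    }

    if (m->nr_valid_paths == 0)
        return ACTION_NO_PATHS;

    return ACTION_CONTINUE;
}
```

Note that the model, like the hunk as shown, keeps no retry counter: as long as one valid path and a hardware handler remain, every failing I/O takes the requeue branch regardless of the error type.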