RE: [PATCH] dm mpath: Try recover from I/O failure by re-initializing the PG if device is running on one path

> -----Original Message-----
> From: Chandra Seetharaman [mailto:sekharan@xxxxxxxxxx]
> Sent: Wednesday, April 22, 2009 12:33 PM
> To: Moger, Babu
> Cc: Kiyoshi Ueda; 'dm-devel@xxxxxxxxxx'; linux-scsi@xxxxxxxxxxxxxxx;
> Chauhan, Vijay
> Subject: RE: [PATCH] dm mpath: Try recover from I/O failure by
> re-initializing the PG if device is running on one path
> 
> You mean to say that we do not require this patch/fix ?
 

No, I did not mean that. What I meant was that re-activating the path from the device handler (rdac_check_sense) is not an option, so we have to find another way to deal with it.

A LUN ownership change can happen if the user knowingly or unknowingly changes the ownership. Our redistribute feature can also change the LUN ownership.
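
To make the case concrete, the check condition quoted from the patch description below (sense 0x05/0x94/0x01, SK/ASC/ASCQ) can be told apart from other errors with a comparison like the following. This is only a standalone sketch, not the rdac handler code: the struct, the macros, and the name is_ownership_change are hypothetical and exist just for illustration. Only the three values come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three sense fields discussed in the thread. */
struct sense_triple {
    uint8_t sk;   /* sense key */
    uint8_t asc;  /* additional sense code */
    uint8_t ascq; /* additional sense code qualifier */
};

/* Values taken from the patch description: 0x05/0x94/0x01. */
#define OWNERSHIP_SK   0x05
#define OWNERSHIP_ASC  0x94
#define OWNERSHIP_ASCQ 0x01

/* Returns true only for the "Lun ownership changed" check condition. */
static bool is_ownership_change(const struct sense_triple *s)
{
    return s->sk == OWNERSHIP_SK &&
           s->asc == OWNERSHIP_ASC &&
           s->ascq == OWNERSHIP_ASCQ;
}
```

Whatever layer ends up doing the recovery would need a distinction like this to reach it, rather than a plain -EIO.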


> On Wed, 2009-04-22 at 08:03 -0600, Moger, Babu wrote:
> > Hi Kiyoshi,
> >
> > > >>
> > > >> Hi Babu,
> > > >>
> > > >> On 2009/04/21 3:05 +0900, Moger, Babu wrote:
> > > >>> This patch introduces the mechanism to recover from I/O failures by
> > > >>> re-initializing the path if the device is running on only one path.
> > > >>>
> > > >>> Problem: Device mapper fails the path for every I/O error. It does not
> > > >>> care about the type of error. There are certain errors which can be
> > > >>> recovered by re-initializing the path again. I have seen this problem
> > > >>> during my testing on rdac device handler. I have observed I/O errors
> > > >>> when there is a change in Lun ownership. When Lun ownership changes
> > > >>> device will return back with check condition with
> > > >>> sense 0x05/0x94/0x01(SK/ASC/ASCQ -meaning Lun ownership changed).
> > > >>> Currently, device mapper fails the path for this error and eventually
> > > >>> this will lead to I/O error. We don't want to see I/O error for this
> > > >>> reason.
> > > >>
> > > >> Shouldn't we handle this type of device error inside device handler?
> > > >
> > > > The current error in question requires re-activation of the path.
> > > > We already have a code to handle this scenario in device handler.
> > > > But, the problem is the return status does not go to DM layer.
> > > > The return status gets lost in scsi layer. For DM layer all the errors
> > > > are -EIO. Any thoughts from your side.
> > >
> > > Oh, I missed the point and I thought that re-activating the path
> > > in your device handler was enough for the error.
> > > Currently, I have no idea to handle your case only in dm without
> > > seeing I/O error.
> > >
> >   I have discussed about re-activating path in device handler with
> > Chandra. Looks like that will lead to other issues (one is long boot up).
> > Look like that is not an option.
> > > By the way, who did change the ownership when the device was running
> > > with one path in your testing?  I can't see why such case happened.
> > >
> > This can happen if the user knowingly or unknowingly changes the
> > ownership. Also we have other feature in our storage which will allow user
> > to redistribute the luns. Thanks for you comment.
> >
> > > Thanks,
> > > Kiyoshi Ueda
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
