Re: [PATCH] multipathd: check and cleanup zombie paths

On Tue, 2018-03-20 at 03:19 +0000, Chongyun Wu wrote:
> On 2018/3/20 5:42, Martin Wilck wrote:
> > On Fri, 2018-03-09 at 10:22 -0600, Benjamin Marzinski wrote:
> > > On Fri, Mar 09, 2018 at 06:47:30AM +0000, Chongyun Wu wrote:
> > > > On 2018/3/8 23:45, Benjamin Marzinski wrote:
> > > > > 
> > > > > If there are multiple routes to the storage, some of them can be
> > > > > down, even if everything is fine on the storage.  This will cause
> > > > > some paths to be up and some to be down, regardless of the state
> > > > > of the LUN.  In every other multipath case but this one, there is
> > > > > just one LUN, and not all the paths have the same state.
> > > > > 
> > > > > Ideally, there would be a way to determine if a path is a zombie,
> > > > > simply by looking at it alone.  The additional sense code "LOGICAL
> > > > > UNIT NOT SUPPORTED" that you posted earlier isn't one that I recall
> > > > > seeing for failed multipathd paths.  I'll check around more, but a
> > > > > quick look makes it appear that this code is only used when you
> > > > > are accessing a LUN that really isn't there.  It's possible that
> > > > > the TUR checker could return a special path state for this, that
> > > > > would cause multipathd to remove the device.  Also, even if that
> > > > > additional sense code is only supposed to be used for this
> > > > > condition, we should still make removing a device that returns it
> > > > > configurable, because I can almost guarantee that there will be a
> > > > > SCSI device that doesn't follow the standard for this.
> > > > > 
> > > > 
> > > > Hi Ben,
> > > > You just mentioned *the TUR checker could return a special path
> > > > state for this*. What is the special path state?  Thanks~
> > > > 
> > > 
> > > We would have to add a new state, like PATH_NOT_SUPPORTED, that the
> > > TUR checker could return in this case.  multipathd could be
> > > configured to remove the path if it returned this state.  If it
> > > wasn't configured to do so, multipathd would just change the state
> > > to PATH_DOWN.
> > 
> > Is it really multipathd's job to remove devices that return "LOGICAL
> > UNIT NOT SUPPORTED"? To me it sounds like a misconfiguration on the
> > SCSI/storage level, and I'm unsure if that's a thing multipathd
> > should mess with.
> > 
> > Martin
> > 
> 
> Actually there are two scenarios:
> (1) Export the LUN to a server at the same time using different LUN
> numbers.
> As you mentioned, this scenario can be considered a misconfiguration,
> which we might not care about.
> (2) Export the LUN to a server at different times using different LUN
> numbers.
> This scenario's operation may be legitimate: the customer just wants to
> reassign the export relations in the storage.
> But the former export operation leaves a residual device in the system,
> which will be adopted by the latter exported device's multipath. Also,
> there are lots of syslog messages for the former device, which actually
> does not exist (at least the customer doesn't think it exists; the
> customer wants only the newly exported device to exist).

I agree that the "residual device" should be removed from the system.
But I don't think that it's multipathd's assignment to detect and
remove such devices. Well, detect and spit out a message - maybe, but
remove - rather not. multipathd is for managing (dm-)multipath devices,
not for taking care of arbitrary problems on the storage layer.
That said, I'd be OK with a PATH_NOT_SUPPORTED state that would result
in the paths being treated like orphans or blacklisted devices.
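
To make the proposal concrete, here is a rough, untested-in-tree sketch of
how such a state could be derived and acted on. It is not multipath-tools
code: every name prefixed with sketch_/SKETCH_ is hypothetical, and the
orphan_unsupported flag stands in for whatever config option would gate the
behaviour. The only facts it relies on are the SCSI sense data layout and
the ILLEGAL REQUEST (05h) / LOGICAL UNIT NOT SUPPORTED (25h/00h) codes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical states for this sketch. SKETCH_PATH_NOT_SUPPORTED is the
 * state proposed in this thread; it is an assumption, not an existing one. */
enum sketch_path_state {
	SKETCH_PATH_DOWN,
	SKETCH_PATH_UP,
	SKETCH_PATH_NOT_SUPPORTED,
};

/* Hypothetical actions the daemon could take for a checked path. */
enum sketch_action {
	SKETCH_KEEP,   /* path stays active in its map */
	SKETCH_FAIL,   /* handle exactly like PATH_DOWN */
	SKETCH_ORPHAN, /* drop the path from its map; SCSI device is left alone */
};

#define SKETCH_KEY_ILLEGAL_REQUEST 0x05
#define SKETCH_ASC_LUN_NOT_SUPP    0x25
#define SKETCH_ASCQ_LUN_NOT_SUPP   0x00

/* True if the sense buffer carries ILLEGAL REQUEST with ASC/ASCQ 25h/00h.
 * Handles both fixed (70h/71h) and descriptor (72h/73h) sense formats. */
static bool sketch_sense_lun_not_supported(const uint8_t *sense, size_t len)
{
	uint8_t key, asc, ascq;

	if (sense == NULL || len < 1)
		return false;

	switch (sense[0] & 0x7f) {
	case 0x70:
	case 0x71: /* fixed format: key in byte 2, ASC/ASCQ in bytes 12/13 */
		if (len < 14)
			return false;
		key = sense[2] & 0x0f;
		asc = sense[12];
		ascq = sense[13];
		break;
	case 0x72:
	case 0x73: /* descriptor format: key in byte 1, ASC/ASCQ in bytes 2/3 */
		if (len < 4)
			return false;
		key = sense[1] & 0x0f;
		asc = sense[2];
		ascq = sense[3];
		break;
	default:
		return false;
	}

	return key == SKETCH_KEY_ILLEGAL_REQUEST &&
	       asc == SKETCH_ASC_LUN_NOT_SUPP &&
	       ascq == SKETCH_ASCQ_LUN_NOT_SUPP;
}

/* Map a TUR result plus its sense data to a path state. */
static enum sketch_path_state
sketch_tur_state(bool tur_ok, const uint8_t *sense, size_t len)
{
	if (tur_ok)
		return SKETCH_PATH_UP;
	if (sketch_sense_lun_not_supported(sense, len))
		return SKETCH_PATH_NOT_SUPPORTED;
	return SKETCH_PATH_DOWN;
}

/* Decide what to do with the path. Without the (hypothetical) opt-in,
 * PATH_NOT_SUPPORTED degrades to plain PATH_DOWN handling. */
static enum sketch_action
sketch_action_for(enum sketch_path_state state, bool orphan_unsupported)
{
	switch (state) {
	case SKETCH_PATH_UP:
		return SKETCH_KEEP;
	case SKETCH_PATH_NOT_SUPPORTED:
		return orphan_unsupported ? SKETCH_ORPHAN : SKETCH_FAIL;
	default:
		return SKETCH_FAIL;
	}
}
```

Note that SKETCH_ORPHAN only detaches the path from the multipath map. It
deliberately does not delete the underlying SCSI device, which stays the
business of the storage layer or the admin.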

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



