On Tue, 2018-03-20 at 03:19 +0000, Chongyun Wu wrote:
> On 2018/3/20 5:42, Martin Wilck wrote:
> > On Fri, 2018-03-09 at 10:22 -0600, Benjamin Marzinski wrote:
> > > On Fri, Mar 09, 2018 at 06:47:30AM +0000, Chongyun Wu wrote:
> > > > On 2018/3/8 23:45, Benjamin Marzinski wrote:
> > > > >
> > > > > If there are multiple routes to the storage, some of them can
> > > > > be down, even if everything is fine on the storage. This will
> > > > > cause some paths to be up and some to be down, regardless of
> > > > > the state of the LUN. In every other multipath case but this
> > > > > one, there is just one LUN, and not all the paths have the
> > > > > same state.
> > > > >
> > > > > Ideally, there would be a way to determine if a path is a
> > > > > zombie, simply by looking at it alone. The additional sense
> > > > > code "LOGICAL UNIT NOT SUPPORTED" that you posted earlier
> > > > > isn't one that I recall seeing for failed multipathd paths.
> > > > > I'll check around more, but a quick look makes it appear that
> > > > > this code is only used when you are accessing a LUN that
> > > > > really isn't there. It's possible that the TUR checker could
> > > > > return a special path state for this, that would cause
> > > > > multipathd to remove the device. Also, even if that
> > > > > additional sense code is only supposed to be used for this
> > > > > condition, we should still make removing a device that
> > > > > returns it configurable, because I can almost guarantee that
> > > > > there will be a scsi device that doesn't follow the standard
> > > > > for this.
> > > > >
> > > >
> > > > Hi Ben,
> > > > You just mentioned *the TUR checker could return a special path
> > > > state for this*, what is the special path state?
> > > > Thanks~
> > > >
> > >
> > > We would have to add a new state, like PATH_NOT_SUPPORTED, that
> > > the TUR checker could return in this case. multipathd could be
> > > configured to remove the path if it returned this state. If it
> > > wasn't configured to do so, multipathd would just change the
> > > state to PATH_DOWN.
> >
> > Is it really multipathd's job to remove devices that return
> > "LOGICAL UNIT NOT SUPPORTED"? To me it sounds like a
> > misconfiguration on the SCSI/storage level, and I'm unsure if
> > that's a thing multipathd should mess with.
> >
> > Martin
> >
>
> Actually there are two scenarios:
> (1) Export the LUN to a server at the same time using different LUN
> numbers.
> As you mentioned, this scenario can be considered a misconfiguration
> which we might not care about.
> (2) Export the LUN to a server not at the same time using different
> LUN numbers.
> This scenario's operation may be right; the customer just wants to
> reassign the export relations in the storage.
> But the former export operation leaves a residual device in the
> system, which will be adopted by the latter exported device's
> multipath. Also there are lots of syslog messages for the former
> device, which actually does not exist (at least the customer doesn't
> think it exists; the customer wants only the newly exported device
> to exist).

I agree that the "residual device" should be removed from the system.
But I don't think that it's multipathd's task to detect and remove
such devices. Well, detect and spit out a message - maybe, but remove -
rather not. multipathd is for managing (dm-)multipath devices, not for
taking care of arbitrary problems on the storage layer.

That said, I'd be OK with a PATH_NOT_SUPPORTED state that would result
in the paths being treated like orphans or blacklisted devices.

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel.
+49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
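
For illustration only, here is a minimal Python sketch of the behaviour
discussed in this thread. It is not multipath-tools code (which is C),
and it greatly simplifies what a real TUR checker does. PATH_UP and
PATH_DOWN are existing multipathd path states; PATH_NOT_SUPPORTED is the
state proposed above and does not exist yet. The function names, the
"orphan_unsupported" switch and the action strings are hypothetical.
The sense values are the standard SCSI ones: sense key ILLEGAL REQUEST
(0x05) with ASC/ASCQ 0x25/0x00 means LOGICAL UNIT NOT SUPPORTED.

```python
from enum import Enum
from typing import Optional, Tuple


class PathState(Enum):
    PATH_DOWN = "down"
    PATH_UP = "up"
    PATH_NOT_SUPPORTED = "not supported"  # proposed in this thread, hypothetical


ILLEGAL_REQUEST = 0x05                     # SCSI sense key
LOGICAL_UNIT_NOT_SUPPORTED = (0x25, 0x00)  # SCSI ASC, ASCQ


def tur_result(sense: Optional[Tuple[int, int, int]]) -> PathState:
    """Map a TEST UNIT READY outcome to a path state (simplified).

    sense is None when the command succeeded, otherwise a
    (sense_key, asc, ascq) tuple.
    """
    if sense is None:
        return PathState.PATH_UP
    sense_key, asc, ascq = sense
    if sense_key == ILLEGAL_REQUEST and (asc, ascq) == LOGICAL_UNIT_NOT_SUPPORTED:
        return PathState.PATH_NOT_SUPPORTED
    # Every other failure is treated as a plain failed path here.
    return PathState.PATH_DOWN


def handle_path(state: PathState, orphan_unsupported: bool) -> Tuple[PathState, str]:
    """Return (effective_state, action) for a checked path.

    orphan_unsupported is a hypothetical configuration switch: if set,
    an unsupported path is treated like an orphan; if not, its state is
    simply changed to PATH_DOWN.
    """
    if state is PathState.PATH_NOT_SUPPORTED:
        if orphan_unsupported:
            return state, "orphan"
        return PathState.PATH_DOWN, "fail"
    if state is PathState.PATH_DOWN:
        return state, "fail"
    return state, "keep"


if __name__ == "__main__":
    state = tur_result((0x05, 0x25, 0x00))
    print(handle_path(state, orphan_unsupported=True))
    print(handle_path(state, orphan_unsupported=False))
```

Keeping the decision in a separate step from the checker mirrors the
point that the reaction should be configurable, since not every SCSI
device can be expected to use this sense code as the standard intends.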