On 06/06/2012 10:59 PM, Brian Bunker wrote:
> Mike,
>
> The devices for LUN 12 are failed and correspond to LUNs not currently shared
> to the initiator at all. They were at one point and were likely used by dm-11
> for its underlying paths. The inquiry data of those LUNs when the problem
> happened was like this:
>
> [root@r13init32 ~]# sg_inq /dev/sde
> standard INQUIRY: [qualifier indicates no connected LU]
> PQual=1 Device_type=31 RMB=0 version=0x06 [SPC-4]
> [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=2
> SCCS=0 ACC=0 TPGS=0 3PC=0 Protect=0 BQue=0
> EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
> [RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
> [SPI: Clocking=0x0 QAS=0 IUS=0]
> length=96 (0x60) Peripheral device type: no physical device on this lu
> Vendor identification: PURE
> Product identification: FlashArray
> Product revision level: 100
>
> There is no NAA number (page code 0x83) or LUN serial number (page code 0x80)
> available, since there is no LUN 12 attached as a disk device at the time
> multipath -ll was run. Different LUNs from our array would never have the
> same NAA value, which is what I think you are calling the UUID.
>
Yep.

Hmm. So the devices are unmapped from the storage, but still visible
from the initiator?
Have you run 'rescan-scsi-bus.sh -r' here? That should clean up
these devices.

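To spot such leftover devices before running the rescan, one could check the standard INQUIRY text for the "no connected LU" signature shown above. The following is only a minimal sketch: it assumes the `PQual=` and `Device_type=` field names exactly as they appear in the quoted sg_inq output (other sg3_utils versions may label them differently), and the function names are made up for illustration.

```python
import re

# Excerpt of the sg_inq output quoted above, used as test input.
SAMPLE = """standard INQUIRY: [qualifier indicates no connected LU]
PQual=1 Device_type=31 RMB=0 version=0x06 [SPC-4]
"""


def parse_inquiry(text):
    """Pull the peripheral qualifier and device type out of sg_inq text.

    Assumes decimal 'PQual=N' and 'Device_type=N' fields as in the
    output quoted in this thread.
    """
    fields = {}
    for key in ("PQual", "Device_type"):
        match = re.search(rf"\b{key}=(\d+)", text)
        if match:
            fields[key] = int(match.group(1))
    return fields


def no_connected_lu(text):
    """Return True if the INQUIRY data says no LU is connected.

    A non-zero peripheral qualifier, or device type 31 (0x1f,
    'no physical device on this lu'), is treated as disconnected.
    """
    fields = parse_inquiry(text)
    return fields.get("PQual", 0) != 0 or fields.get("Device_type") == 31


if __name__ == "__main__":
    print(no_connected_lu(SAMPLE))
```

In practice the text would come from running sg_inq on each path device; that step is left out here so the sketch does not depend on any particular host.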
> The sequence is something like this: share a LUN from the array with two
> paths to the initiator, and a dm device gets created, presumably like this
> at first (except that the status would be active and ready, not failed and
> faulty):
>
> 3624a93700a14254d729923840001000b dm-11 PURE,FlashArray
> size=500G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
> |- 1:0:0:12 sde 8:64 failed faulty running
> |- 0:0:0:12 sdd 8:48 failed faulty running
>
> Then that LUN 12 is taken away from the initiator and the dm device dm-11 is
> reused later by LUN 10 when it is shared to the initiator, but the LUN 12
> devices still remain as part of the dm device. Then I would expect:
>
> 3624a93700a14254d729923840001000b dm-11 PURE,FlashArray
> size=500G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
> |- 0:0:0:10 sdar 66:176 active ready running
> !- 1:0:0:10 sdba 67:64 active ready running
>
Yeah, but still: it means that at one point LUN 12 had the same NAA
value as LUN 10, correct?

It _might_ happen that multipath created a dm-device for LUN12, set
its paths to 'faulty' during unsharing, and then added the then-new
LUN10 to the same device, given that the NAA number is identical.

So the point still stands: LUN10 must have had the same NAA value
as LUN12 now has.
So unless the original LUN10 referred to the same storage entity as
LUN12 now does, this is a definite no-no.
And if it does, we're pretty much in the clear, as then LUN10 would
now refer to a stale device (with status 'failed faulty'), and
should be cleared up with 'rescan-scsi-bus.sh -r'.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@xxxxxxx +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
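The situation discussed above (stale LUN 12 paths sharing one map with the new LUN 10 paths) could be detected from 'multipath -ll' text roughly as sketched below. This is an illustration under assumptions, not a tested tool: the sample listing is hypothetical (the two listings from this thread merged into one map), the parser assumes the alias-free header format quoted here, and all function names are invented. Note also that differing LUN numbers within one map are only a hint, since an array may legitimately present one LU under different LUN numbers on different ports.

```python
import re

# Hypothetical listing: both path sets from this thread merged into one map.
SAMPLE = """\
3624a93700a14254d729923840001000b dm-11 PURE,FlashArray
size=500G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:12 sde  8:64   failed faulty running
  |- 0:0:0:12 sdd  8:48   failed faulty running
  |- 0:0:0:10 sdar 66:176 active ready running
  `- 1:0:0:10 sdba 67:64  active ready running
"""

# Map header without a user_friendly_names alias: "<wwid> dm-N vendor,product"
MAP_RE = re.compile(r"^(\S+)\s+(dm-\d+)\s")
# Path line: "H:C:T:L devname major:minor dm_state checker_state ..."
PATH_RE = re.compile(
    r"(\d+):(\d+):(\d+):(\d+)\s+(\S+)\s+\d+:\d+\s+(\S+)\s+(\S+)"
)


def parse_maps(text):
    """Group path lines under the most recent map header line."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = maps.setdefault(
                header.group(2), {"wwid": header.group(1), "paths": []}
            )
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            current["paths"].append({
                "lun": int(path.group(4)),
                "dev": path.group(5),
                "dm_state": path.group(6),
                "checker": path.group(7),
            })
    return maps


def mixed_lun_maps(maps):
    """Return {dm name: sorted LUN numbers} for maps with more than one LUN."""
    result = {}
    for name, info in maps.items():
        luns = sorted({p["lun"] for p in info["paths"]})
        if len(luns) > 1:
            result[name] = luns
    return result


def failed_paths(maps):
    """Return (dm name, device) pairs whose path is failed or faulty."""
    return [
        (name, p["dev"])
        for name, info in maps.items()
        for p in info["paths"]
        if p["dm_state"] == "failed" or p["checker"] == "faulty"
    ]


if __name__ == "__main__":
    parsed = parse_maps(SAMPLE)
    print(mixed_lun_maps(parsed))
    print(failed_paths(parsed))
```

Any map reported by both helpers would be a candidate for the 'rescan-scsi-bus.sh -r' cleanup mentioned above.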