Antw: [EXT] Re: RFC: one more time: SCSI device identification


 



>>> Hannes Reinecke <hare@xxxxxxx> wrote on 27.04.2021 at 10:21 in message
<2a6903e4-ff2b-67d5-e772-6971db8448fb@xxxxxxx>:
> On 4/27/21 10:10 AM, Martin Wilck wrote:
>> On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
>>>>
>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>> afaics.
>>>>
>>> In my view the WWID should never change. 
>> 
>> In an ideal world, perhaps not. But in the dm-multipath realm, we know
>> that WWID changes can happen with certain storage arrays. See
>> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html
>> and follow-ups, for example.
>> 
> And it's actually something which might happen quite easily.
> The storage array can unmap a LUN, delete it, create a new one, and map
> that one into the same LUN number as the old one.
> If we didn't do I/O during that interval, upon the next I/O we will be
> getting the dreaded 'Power-On/Reset' sense code.
> _And nothing else_, due to the arcane rules for sense code generation in
> SAM.
> But we end up with a completely different device.
> 
> The only way out of it is to do a rescan for every POR sense code, and
> disable the device, e.g. via DID_NO_CONNECT, whenever we find that the
> identification has changed. We already have a copy of the original VPD
> page 0x83 at hand, so that should be reasonably easy.
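
A minimal sketch of the check described above, written in Python for brevity rather than as kernel code. It assumes the caller can re-read VPD page 0x83 on demand; `handle_sense`, `read_pg83` and the returned action strings are hypothetical names used only for illustration. Comparing the raw page bytes is a simplification; a real implementation would probably compare only the logical unit designators.

```python
from typing import Callable

UNIT_ATTENTION = 0x06       # sense key: UNIT ATTENTION
ASC_POWER_ON_RESET = 0x29   # ASC: power on, reset, or bus device reset occurred


def is_power_on_reset(sense_key: int, asc: int) -> bool:
    """True if the sense data reports a Power-On/Reset unit attention."""
    return sense_key == UNIT_ATTENTION and asc == ASC_POWER_ON_RESET


def handle_sense(sense_key: int, asc: int, cached_pg83: bytes,
                 read_pg83: Callable[[], bytes]) -> str:
    """Decide what to do with a device after a check condition.

    Returns a hypothetical action name:
      "ignore"  - not a Power-On/Reset, nothing to verify here
      "retry"   - identification unchanged, the I/O can be retried
      "offline" - identification changed, treat it as a different device
    """
    if not is_power_on_reset(sense_key, asc):
        return "ignore"
    fresh_pg83 = read_pg83()  # re-read the identification page
    if fresh_pg83 == cached_pg83:
        return "retry"
    return "offline"
```

In this sketch "offline" stands in for failing the device, for example with DID_NO_CONNECT as suggested above.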

I don't know the depths of the SCSI or FC protocols, but storage systems
typically signal such events, either via some unit attention or via some FC
event. Older kernels logged that there was a change, but a manual SCSI bus scan
was needed, while newer kernels find new devices "automagically" for some
products. The HP EVA 6000 series worked that way, and a 3PAR StoreServ 8000
series also seems to work that way, but not a Pure Storage X70 R3. For the
latter you need something like an FC LIP to make the kernel detect the new
devices (LUNs). I'm unsure where the problem is, but in principle the kernel
can be notified...
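
For reference, a small sketch of how a manual bus scan and an FC LIP are usually requested through sysfs on Linux. The `scan` attribute under `/sys/class/scsi_host/hostN/` and the `issue_lip` attribute under `/sys/class/fc_host/hostN/` are the standard interfaces; the function names and the `sysfs_root` parameter (there only so the code can be tried against a scratch directory) are made up for this example. Writing to the real files needs root, and a LIP can disrupt running I/O, so treat this as an illustration only.

```python
from pathlib import Path


def request_scsi_scan(host: int, sysfs_root: Path = Path("/sys")) -> Path:
    """Ask SCSI host <host> to scan all channels, targets and LUNs."""
    target = sysfs_root / "class" / "scsi_host" / f"host{host}" / "scan"
    # "- - -" is the wildcard form: any channel, any target, any LUN.
    target.write_text("- - -\n")
    return target


def request_fc_lip(host: int, sysfs_root: Path = Path("/sys")) -> Path:
    """Ask FC host <host> to issue a LIP (loop initialization)."""
    target = sysfs_root / "class" / "fc_host" / f"host{host}" / "issue_lip"
    target.write_text("1\n")
    return target
```

Usage would be something like `request_scsi_scan(3)` for host3, followed by `request_fc_lip(3)` only if the scan alone does not find the new LUNs.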

> 
> I had a rather lengthy discussion with Fred Knight @ NetApp about
> Power-On/Reset handling, what with him complaining that we don't handle
> it correctly. So this really is something we should be looking into,
> even independently of multipathing.
> 
> But actually I like the idea from Martin Petersen to expose the parsed
> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
> from the udev rules.
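
To illustrate what "parsed VPD identifiers" means here, a rough user-space sketch that splits a raw page 0x83 buffer into its designation descriptors. The byte layout follows the Device Identification VPD page as defined in SPC; `Designator` and `parse_vpd_pg83` are hypothetical names, and this is not the kernel's parser. The raw page can typically be read from the binary sysfs attribute `vpd_pg83` of the SCSI device, assuming the kernel in use exposes it.

```python
import struct
from typing import List, NamedTuple


class Designator(NamedTuple):
    association: int   # 0 = logical unit, 1 = target port, 2 = target device
    dtype: int         # e.g. 1 = T10 vendor ID, 2 = EUI-64, 3 = NAA
    code_set: int      # 1 = binary, 2 = ASCII, 3 = UTF-8
    value: bytes


def parse_vpd_pg83(page: bytes) -> List[Designator]:
    """Split a VPD page 0x83 buffer into designation descriptors."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a VPD page 0x83 buffer")
    (page_len,) = struct.unpack_from(">H", page, 2)  # big-endian length
    end = min(len(page), 4 + page_len)
    result = []
    off = 4
    while off + 4 <= end:
        code_set = page[off] & 0x0F
        association = (page[off + 1] >> 4) & 0x03
        dtype = page[off + 1] & 0x0F
        dlen = page[off + 3]
        value = page[off + 4:off + 4 + dlen]
        if len(value) < dlen:
            break  # truncated descriptor, stop rather than guess
        result.append(Designator(association, dtype, code_set, value))
        off += 4 + dlen
    return result
```

With the descriptors split out like this, picking the logical unit NAA designator becomes a simple filter on `association == 0` and `dtype == 3`.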

Talking of VPDs: Somewhere in the last 12 years (within SLES 11) there was a
kernel change regarding trailing blanks in VPD data. That change broke
several configurations, which could no longer recognize their devices. In one
case the software had even bound a license to a specific device by serial
number, and that software found "new" devices while missing the "old" ones...
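
A tiny illustration of the trailing-blank problem, with made-up serial numbers: an identifier stored with padding no longer matches the same identifier reported without it, unless both sides are normalized before comparison. `normalize_serial` and `same_device` are hypothetical helpers, not anything the kernel or udev provides.

```python
def normalize_serial(raw: bytes) -> str:
    """Decode an ASCII identifier field and drop its trailing blanks."""
    return raw.decode("ascii", errors="replace").rstrip(" ")


def same_device(stored: bytes, reported: bytes) -> bool:
    """Compare two identifiers, ignoring trailing blank padding."""
    return normalize_serial(stored) == normalize_serial(reported)


# The ID as stored before the change (padded) and as reported after it.
stored = b"SN0001    "
reported = b"SN0001"

naive_match = stored == reported               # byte-for-byte comparison
robust_match = same_device(stored, reported)   # padding ignored
```

Software that stores the normalized form, or normalizes on comparison, would not have tripped over such a change.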

Regards,
Ulrich

> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke		        Kernel Storage Architect
> hare@xxxxxxx			               +49 911 74053 688
> SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> GF: F. Imendörffer, HRB 36809 (AG Nürnberg)




--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel



