Re: NAA breakage

Hi Nicholas,

(b) Are there any restrictions for vpd_unit_serial format in T10
specifications? Now, afaik configfs allows me to set an arbitrary
string...

This is defined as a VENDOR SPECIFIC IDENTIFIER for the NAA IEEE
Registered Extended designator format, so the answer to that would be
no..

(c) If there are no restrictions for the serial number format, NAA
should be probably generated using a hash function (e.g. SHA) instead
of hex2bin. The present implementation can easily produce identical
NAAs for two different serial numbers which is really bad.

Well, we currently expect userspace to be in charge of generating a UUID
that is unique for vpd_serial_number.  hex2bin is simply doing the
conversion for the NAA designator here, and should not be in charge of
making sure what's set in vpd_wwn_serial is really unique.

But if there are no restrictions on vpd_serial_number, it is legal to
create target devices with serial numbers such as "xxxxxxxxxxxxxxxx"
and "yyyyyyyyyyyyyyyy". Both will yield the same NAA even though the
serials are distinct, because hex2bin converts every illegal character
to 0xf. Shouldn't vpd_serial_number input be restricted to hex
characters only?
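
To make the collision concrete, here is a minimal Python sketch. It
only models the behaviour as described in this thread (every non-hex
character turning into the nibble 0xf); it is not the kernel code and
it ignores the NAA header and company ID bytes. The names naive_id,
hashed_id and is_hex_serial are hypothetical, and the 8-byte truncation
of the SHA-256 digest is an arbitrary choice for illustration.

```python
import hashlib

HEX_DIGITS = set("0123456789abcdefABCDEF")


def lossy_nibbles(serial):
    """Model of the conversion described in this thread (an assumption,
    not the kernel source): non-hex characters all become 0xf."""
    return [int(ch, 16) if ch in HEX_DIGITS else 0xF for ch in serial]


def pack_nibbles(nibbles):
    """Pack a list of 4-bit values into bytes, padding with a zero nibble."""
    if len(nibbles) % 2:
        nibbles = nibbles + [0]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def naive_id(serial):
    """Identifier bytes derived by the lossy per-character conversion."""
    return pack_nibbles(lossy_nibbles(serial))


def hashed_id(serial, length=8):
    """Hypothetical alternative: truncated SHA-256 of the serial string.
    Truncation makes collisions unlikely, not impossible."""
    return hashlib.sha256(serial.encode("utf-8")).digest()[:length]


def is_hex_serial(serial):
    """Hypothetical input check: accept only non-empty, hex-only serials."""
    return bool(serial) and all(ch in HEX_DIGITS for ch in serial)


if __name__ == "__main__":
    a, b = "x" * 16, "y" * 16
    print(naive_id(a) == naive_id(b))    # lossy conversion collides
    print(hashed_id(a) == hashed_id(b))  # hashed identifiers differ
    print(is_hex_serial(a), is_hex_serial("5a4c2c4daa90dd59"))
```

Either approach would avoid this particular collision: hashing accepts
arbitrary serials, while the hex-only check rejects them up front.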

(d) IMHO this issue should be solved during this mainline release,
because the growing number of LIO target users will make future fixes
harder.

How severe is the breakage with VMWare here when the NAA WWN changes..?
Does this require a logout -> relogin from the perspective of the ESX
client..?  Or does this cause issues with on-disk metadata for VMFS that
references existing NAA WWNs..?

Well, first of all, I'm not a VMware expert. Based upon my tests and
research in the last two days, this is a serious headache for VMware
ESX(i). ESX >= 3.5 uses the NAA identifier as a guaranteed unique
signature of a physical volume and saves a copy of the NAA in the VMFS
header. When establishing a storage session, the on-disk signatures of
VMFS extents are compared with the actual NAAs presented by the storage
to avoid data corruption, maintain multiple paths to a single volume,
etc.

In practice, when I changed the NAA of an active VMFS volume with
running VMs, it resulted in an unrecoverable error (see
kb.vmware.com/kb/1003416):

"ALERT: NMP: vmk_NmpVerifyPathUID: The physical media represented by
device naa.600140535a4c2c4daa90dd591dc453dd (path vmhba34:C0:T0:L8)
has changed. If this is a data LUN, this is a critical error."

Ugh, so this is actually written into the VMFS header.  So to verify
this again, this only happens when you try to upgrade when the VMFS is
active and mounted, correct..?

Yes, this unrecoverable error happened when I changed the NAA of an
actively used VMFS volume. More specifically, I upgraded one node of
the storage cluster and initiated a failover to it. ESXi detected the
loss of connection, established a new session to the upgraded node,
found that the NAA differed from the one stored in the VMFS header, and
blocked the volume.

I didn't test an NAA change on an inactive, unmounted VMFS volume, but
I expect that VMware will treat such a volume as a storage snapshot and
that it will need to be resignatured. See kb.vmware.com/kb/1011387 or
the blog post at
http://holyhandgrenade.org/blog/2010/07/practical-vmfs-signatures/

Ok, expecting folks to have to unmount a VMFS volume before upgrading
to a v3.1 (eg: new major release) kernel on the target is not completely
out of the question.

Would it be possible for you to verify if this is really the case?

Yes, I'll run more tests to find a way to recover a VMware cluster from
this and let you know.

Thanks for your answers

Martin
