Re: NAA breakage

On Wed, 2011-08-17 at 13:50 +0200, Martin Svec wrote:
> Hi Nicholas,
> 
> > Mmmm, I think the right solution here would be ignoring the extra '-'
> > characters at the point that the vpd_unit_serial attribute is set
> > via configfs..  However, this would still obviously cause an issue
> > with the NAA WWN changing..
> 
> I think the following points should be solved:
> 
> (a) How many existing production setups can be affected in the same 
> way as my lab cluster? My setup is quite special because I run LIO on 
> top of active/passive DRBD, generate my own serials to maintain LUN 
> identities across DRBD nodes, access the configfs plane directly using 
> my own library instead of rtsadmin/lio-utils, etc. I can easily change 
> the serial number generator because we don't use LIO in production yet, 
> but that does not solve the problem for others.
> 

Yes, stripping out the '-' characters from the vpd_unit_serial value in
the userspace code that sets it, and in the configfs parsing code as an
extra safeguard, makes the most sense..
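
Something along these lines is what I have in mind for the parsing side
(a rough sketch only, not an actual patch, and the function name is made
up):

static void strip_dashes(char *serial)
{
	char *src = serial, *dst = serial;

	/* Copy the string over itself, dropping every '-' */
	while (*src) {
		if (*src != '-')
			*dst++ = *src;
		src++;
	}
	*dst = '\0';
}

That way a UUID-style string with dashes and the same string without
them end up stored, and converted, identically.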

> (b) Are there any restrictions on the vpd_unit_serial format in the 
> T10 specifications? Right now, AFAIK, configfs allows me to set an 
> arbitrary string...
> 

This is defined as a VENDOR SPECIFIC IDENTIFIER for the NAA IEEE
Registered Extended designator format, so the answer to that would be
no..
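
For reference, the 16-byte NAA IEEE Registered Extended designator from
VPD page 0x83 is laid out roughly as below; the helper is only an
illustration of the header packing with a placeholder OUI parameter, not
code from the tree:

/*
 *   bits 127..124  NAA = 6h (IEEE Registered Extended)
 *   bits 123..100  IEEE Company ID (OUI, 24 bits)
 *   bits  99..64   VENDOR SPECIFIC IDENTIFIER (36 bits)
 *   bits  63..0    VENDOR SPECIFIC IDENTIFIER EXTENSION (64 bits)
 */
static void naa_ieee_reg_ext_hdr(unsigned char *buf, unsigned int oui)
{
	/* NAA nibble in the top four bits, then the high nibble of the OUI */
	buf[0] = 0x60 | ((oui >> 20) & 0xf);
	buf[1] = (oui >> 12) & 0xff;
	buf[2] = (oui >> 4) & 0xff;
	/* Low nibble of the OUI; the rest of buf[3..15] carries the
	 * VENDOR SPECIFIC IDENTIFIER (+ extension) filled from the serial. */
	buf[3] = (oui & 0xf) << 4;
}

So T10 constrains the designator layout itself, not the string that ends
up in vpd_unit_serial.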

> (c) If there are no restrictions on the serial number format, the NAA 
> should probably be generated using a hash function (e.g. SHA) instead 
> of hex2bin. The present implementation can easily produce identical 
> NAAs for two different serial numbers, which is really bad.
> 

Well, we currently expect userspace to be in charge of generating a UUID
that is unique for vpd_unit_serial.  hex2bin is simply doing the
conversion to the NAA designator here, and should not be in charge of
making sure that what's set in vpd_unit_serial is really unique.
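
To make it clear what the conversion amounts to, the rough shape is
below (a sketch only, not the exact code in the tree): hex digits from
the unit serial are packed nibble by nibble into the vendor-specific
bytes of the designator, and anything that is not a hex digit is simply
skipped:

static int hex_to_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

static void serial_to_naa_bytes(const char *serial, unsigned char *dst,
				int dst_len)
{
	int nibbles = 0;

	while (*serial && nibbles < dst_len * 2) {
		int nib = hex_to_nibble(*serial++);

		if (nib < 0)
			continue;	/* skip '-' and other non-hex chars */
		if (nibbles & 1)
			dst[nibbles / 2] |= nib;
		else
			dst[nibbles / 2] = nib << 4;
		nibbles++;
	}
}

Which also means two serials that agree in their leading hex digits will
collide, so the uniqueness really has to come from whatever writes
vpd_unit_serial in the first place.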

> (d) IMHO this issue should be solved during this mainline release, 
> because the growing number of LIO target users will make future fixes 
> harder.
> 
> > How severe is the breakage with VMware here when the NAA WWN changes..?
> > Does this require a logout -> relogin from the perspective of the ESX
> > client..?  Or does this cause issues with on-disk metadata for VMFS that
> > references existing NAA WWNs..?
> 
> Well, first of all, I'm not a VMware expert. Based upon my tests and 
> research over the last two days, this is a serious headache for VMware 
> ESX(i). ESX >= 3.5 uses the NAA identifier as a guaranteed-unique 
> signature of a physical volume and saves a copy of the NAA to the VMFS 
> header. When establishing a storage session, the on-disk signatures of 
> VMFS extents are compared with the actual NAAs presented by the storage 
> to avoid data corruption, maintain multiple paths to a single volume, etc.
>
> In practice, when I changed the NAA of an active VMFS volume with 
> running VMs, it resulted in an unrecoverable error (see 
> kb.vmware.com/kb/1003416):
> 
> "ALERT: NMP: vmk_NmpVerifyPathUID: The physical media represented by 
> device naa.600140535a4c2c4daa90dd591dc453dd (path vmhba34:C0:T0:L8) 
> has changed. If this is a data LUN, this is a critical error."
> 

Ugh, so this is actually written into the VMFS header.  So just to
verify once more: this only happens when you try to upgrade while the
VMFS is active and mounted, correct..?

> I didn't test an NAA change of an inactive, unmounted VMFS volume, but 
> I expect that VMware will treat such a volume as a storage snapshot and 
> it will need to be resignatured. See kb.vmware.com/kb/1011387 or the 
> http://holyhandgrenade.org/blog/2010/07/practical-vmfs-signatures/ 
> blog post.
> 

Ok, expecting folks to have to unmount a VMFS volume before upgrading
to a v3.1 (e.g. a new major release) kernel on the target is not
completely out of the question.

Would it be possible for you to verify whether this is really the case?

> In all cases, nontrivial effort is probably necessary to make it work 
> again. It seems to me that the easiest solution (and the only solution 
> without downtime) is to migrate all VMs to other shared storage using 
> Storage vMotion, destroy the VMFS volume, change the NAA, recreate the 
> VMFS, and migrate the VMs back. (But if somebody knows an easier way to 
> restore an active VMFS volume after an NAA change, please tell me :-))
> 

Ok, I'll likely end up reverting this to the old logic to avoid this
altogether for -rc3, but I would really like to know the severity of
this first for the 'inactive, unmounted VMFS volume' case.  I think
that forcing existing users to have to do this is not completely out of
the question when upgrading from out-of-tree -> mainline code, but we
need to ensure that VMFS is intelligent enough to recover from this in
the first place.

Thank you for your input here,

--nab
