Re: Non persistent SCSI serial (word 83)

Thank you for your prompt response. You pointed straight at my problem.

To conclude: there is a bug in Red Hat's iSCSILogicalUnit resource agent
script for the 'PCS' HA cluster (the latest generation of the 'Red Hat
Cluster Suite') which sets the serial number (SN) incorrectly. Now that I
have worked around the bug, I will find out whom I need to notify about
my fix, so that Red Hat can merge it into their suite.
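
For the archives, here is a minimal sketch of one way to keep the serial
stable across nodes. It is not the resource agent's actual code. The idea
is to derive the serial deterministically from a name that is identical on
both nodes and write it to the vpd_unit_serial attribute mentioned in the
quoted reply below. The function names, the UUID-shaped format and the
optional configfs root argument are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and serial format are hypothetical.

# stable_serial NAME
# Print a UUID-shaped string built from the first 32 hex digits of the
# SHA-256 digest of NAME. The same NAME yields the same serial on any node.
stable_serial() {
    printf '%s' "$1" | sha256sum | cut -c1-32 |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

# set_unit_serial HBA DEV SERIAL [CONFIGFS_ROOT]
# Write SERIAL to $CONFIGFS_ROOT/$HBA/$DEV/wwn/vpd_unit_serial.
# CONFIGFS_ROOT defaults to the LIO path from the quoted reply; the
# override exists only so the function can be tried on a scratch directory.
set_unit_serial() {
    sus_root=${4:-/sys/kernel/config/target/core}
    sus_attr="$sus_root/$1/$2/wwn/vpd_unit_serial"
    if [ ! -e "$sus_attr" ]; then
        echo "missing attribute: $sus_attr" >&2
        return 1
    fi
    printf '%s\n' "$3" > "$sus_attr"
}

# Example call (HBA name iblock_0 is an assumption, check your own tree):
#   set_unit_serial iblock_0 lun3-tier3 \
#       "$(stable_serial 'iqn.2005-05.com.poliva:cib.tier3/lun3')"
```

As far as I can tell, the kernel refuses this write once the device has
active exports, so it would have to run after the backstore is created
and before the LUN is mapped into the target, on whichever node takes
over.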

Thanks!
Etzion

On 23 August 2015 at 01:00, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Sat, 2015-08-22 at 01:33 +0300, Etzion Bar-Noy wrote:
>> Hi. I have been looking for a solution for a while now and have found
>> none, so this post is my last attempt to solve the LIO issue I am
>> encountering before I give up and move to some other solution...
>> Description:
>> OS: CentOS 7.1, latest updates (as of August 2015).
>> Targetcli version: rpm -qa | grep targetcli
>> targetcli-2.1.fb37-3.el7.noarch
>> Kernel: uname -a
>> Linux controller1 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>>
>> If there's anything missing, let me know.
>>
>> Problem summary: In a PCS-based HA cluster, when a LUN fails over, the
>> LUN serial changes, and this causes multipath clients to misbehave
>> (especially after an iSCSI client reboot).
>> Some more about the setup: it uses two nodes in a PCS-based cluster.
>> The cluster setup is a modified version of the one described here:
>> https://bm-stor.com/index.php/blog/Linux-cluster-with-ZFS-on-Cluster-in-a-Box/
>> except that I use multipathing and not network teaming.
>> iSCSI layout:
>> targetcli ls
>> o- / ...... [...]
>>   o- backstores ...... [...]
>>   | o- block ...... [Storage Objects: 2]
>>   | | o- lun2-tier2 ...... [/dev/mapper/T2-lun2 (6.6TiB) write-thru activated]
>>   | | o- lun3-tier3 ...... [/dev/mapper/T3-lun1 (1024.0GiB) write-thru activated]
>>   | o- fileio ...... [Storage Objects: 0]
>>   | o- pscsi ...... [Storage Objects: 0]
>>   | o- ramdisk ...... [Storage Objects: 0]
>>   o- iscsi ...... [Targets: 2]
>>   | o- iqn.2005-05.com.poliva:cib.tier2 ...... [TPGs: 1]
>>   | | o- tpg1 ...... [gen-acls, no-auth]
>>   | |   o- acls ...... [ACLs: 0]
>>   | |   o- luns ...... [LUNs: 1]
>>   | |   | o- lun2 ...... [block/lun2-tier2 (/dev/mapper/T2-lun2)]
>>   | |   o- portals ...... [Portals: 2]
>>   | |     o- 10.254.254.4:3260 ...... [OK]
>>   | |     o- 10.254.255.4:3260 ...... [OK]
>>   | o- iqn.2005-05.com.poliva:cib.tier3 ...... [TPGs: 1]
>>   |   o- tpg1 ...... [gen-acls, no-auth]
>>   |     o- acls ...... [ACLs: 0]
>>   |     o- luns ...... [LUNs: 1]
>>   |     | o- lun3 ...... [block/lun3-tier3 (/dev/mapper/T3-lun1)]
>>   |     o- portals ...... [Portals: 2]
>>   |       o- 10.254.254.5:3260 ...... [OK]
>>   |       o- 10.254.255.5:3260 ...... [OK]
>>   o- loopback ...... [Targets: 0]
>>
>> I do not use ACLs for the time being (unless required to).
>> After a LUN takeover/takeback (i.e. relocation to another host), the
>> IP address is back up within about 10-20 seconds, the iSCSI target
>> is up and available, and all cluster resources show as healthy.
>> However, the client host, especially if rebooted, will not see the
>> same identifier for this LUN (serial, word 83, name it as you like).
>> It is 100% reproducible when two conditions are met, in either order:
>> 1. The LUN is migrated to the other node
>> 2. The client machine is rebooted
>> # multipath -ll
>> mpathc (36001405e3de7a9800000000000000000) dm-2 LIO-ORG,lun3-tier3
>> size=1024G features='1 queue_if_no_path' hwhandler='0' wp=rw
>> `-+- policy='round-robin 0' prio=1 status=active
>>   |- 0:0:0:3 sdb  8:16  active ready running
>>   `- 1:0:0:3 sda  8:0   active ready running
>> [root@temp-iSCSI ~]# multipath -F
>> [root@temp-iSCSI ~]# iscsiadm -m node -U all
>> Logging out of session [sid: 1, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.255.5,3260]
>> Logging out of session [sid: 2, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.254.5,3260]
>> Logout of [sid: 1, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.255.5,3260] successful.
>> Logout of [sid: 2, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.254.5,3260] successful.
>> [root@temp-iSCSI ~]# iscsiadm -m node -L all
>> Logging in to [iface: default, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.255.5,3260] (multiple)
>> Logging in to [iface: default, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.254.5,3260] (multiple)
>> Login to [iface: default, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.255.5,3260] successful.
>> Login to [iface: default, target: iqn.2005-05.com.poliva:cib.tier3, portal: 10.254.254.5,3260] successful.
>> [root@temp-iSCSI ~]# multipath
>> [root@temp-iSCSI ~]# multipath -ll
>> mpathd (3600140599e044f8681345d3aa4824abc) dm-2 LIO-ORG,lun3-tier3
>> size=1024G features='1 queue_if_no_path' hwhandler='0' wp=rw
>> `-+- policy='round-robin 0' prio=1 status=active
>>   |- 2:0:0:3 sdb  8:16  active ready running
>>   `- 3:0:0:3 sda  8:0   active ready running
>>
>>
>> I am not using ACLs for the time being. I will integrate ACLs later on.
>>
>> Thanks for any insight, or even a simple tip on how I can maintain a
>> dedicated HA solution.
>
> The backend device UUID (and EVPD=0x83 that uses it) is set in
>
>    /sys/kernel/config/target/core/$HBA/$DEV/wwn/vpd_unit_serial
>
> From the looks of it, your H/A scripts are resetting it to something new
> each time export fail-over occurs.
>
> You'll need to make sure it's using the same value on both nodes, to
> ensure a consistent view to active initiators.
>
> --nab
>