Hi Mike,

Mike Christie <mchristi <at> redhat.com> writes:
> Yeah, ESXi requires that it gets the same info from all ports that the
> LU can be accessed through. This includes group id values and also ALUA
> states. In your example you had group id 2 showing standby on kio1, but
> that group did not exist on e1. Also e1 had a group id 1, but kio1 did not.
>
>
> For failover, how are you handling the alua state changes? In my example
> above e1 has the active paths. If they went down, ESXi would send a STPG
> to kio1. We would then need to resync the alua state. It would look
> something like this where group id 0 goes to active group id 1 goes to
> standby on both nodes [note the status info might be wrong below but you
> get the idea with the groups and alua states changing]:
>
> root@kio1:/sys/kernel/config/target# tcm_node --listtgptgps
> iblock_4711/p_FCLun_test2
> \------> kio1 Target Port Group ID: 0
>          Active ALUA Access Type(s): Implicit and Explicit
>          Primary Access State: Active/Optimized
>          Primary Access Status: Altered by Explicit ALUA
>          Preferred Bit: 0
>          Active/NonOptimized Delay in milliseconds: 100
>          Transition Delay in milliseconds: 0
>          \------> TG Port Group Members
>              qla2xxx/naa.21000024ff4f0dee/tpgt_1/lun_1
>              qla2xxx/naa.21000024ff4f0def/tpgt_1/lun_1
>
> \------> default_tg_pt_gp Target Port Group ID: 1
>          Active ALUA Access Type(s): Implicit and Explicit
>          Primary Access State: Standby
>          Primary Access Status: None
>          Preferred Bit: 0
>          Active/NonOptimized Delay in milliseconds: 100
>          Transition Delay in milliseconds: 0
>          \------> TG Port Group Members
>              No Target Port Group Members
>
> root@e1:/var/log# tcm_node --listtgptgps iblock_4711/p_FCLun_test
> \------> e1 Target Port Group ID: 1
>          Active ALUA Access Type(s): Implicit and Explicit
>          Primary Access State: Standby
>          Primary Access Status: None
>          Preferred Bit: 0
>          Active/NonOptimized Delay in milliseconds: 100
>          Transition Delay in milliseconds: 0
>          \------> TG Port Group Members
>              qla2xxx/naa.21000024ff4f0f20/tpgt_1/lun_1
>              qla2xxx/naa.21000024ff4f0f21/tpgt_1/lun_1
>
> \------> default_tg_pt_gp Target Port Group ID: 0
>          Active ALUA Access Type(s): Implicit and Explicit
>          Primary Access State: Active/Optimized
>          Primary Access Status: None
>          Preferred Bit: 0
>          Active/NonOptimized Delay in milliseconds: 100
>          Transition Delay in milliseconds: 0
>          \------> TG Port Group Members
>              No Target Port Group Members

If I understand you correctly, the following configuration must be
present when the LUN is active on node e1:

Node e1 (LUN is a member of both groups):
  TPG name: e1,   TPG ID = 1, ALUA state: Active/Optimized
  TPG name: kio1, TPG ID = 2, ALUA state: Standby

Node kio1 (LUN is *not* a member of either group):
  TPG name: e1,   TPG ID = 1, ALUA state: Active/Optimized
  TPG name: kio1, TPG ID = 2, ALUA state: Standby

And this would be the change when we want the LUN to switch to node kio1
(or when node e1 crashes):

Node e1 (if present) (LUN is *not* a member of either group):
  TPG name: e1,   TPG ID = 1, ALUA state: Standby
  TPG name: kio1, TPG ID = 2, ALUA state: Active/Optimized

Node kio1 (LUN is a member of both groups):
  TPG name: e1,   TPG ID = 1, ALUA state: Standby
  TPG name: kio1, TPG ID = 2, ALUA state: Active/Optimized

But shouldn't it work to just have one TPG with the same name and ID on
both nodes, and simply switch the ALUA state and the membership of the LUN
on failover?

We are preparing to test failover under load and will post results in the
coming week.

Best regards,
RW
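
For illustration only, the two-group layout and the failover swap described above can be modelled as a small state table. This is a minimal sketch under stated assumptions: the class and function names (`TargetPortGroup`, `initial_view`, `fail_over`, `views_consistent`) are hypothetical, it does not call tcm_node or touch configfs, and it ignores LUN membership. It only captures the requirement that every node reports the same group IDs and ALUA states.

```python
from dataclasses import dataclass

ACTIVE = "Active/Optimized"
STANDBY = "Standby"


@dataclass(frozen=True)
class TargetPortGroup:
    """One target port group as a node would report it (hypothetical model)."""
    name: str
    group_id: int
    alua_state: str


def initial_view():
    """Group table while the LUN is active on node e1 (IDs taken from the mail)."""
    return {
        1: TargetPortGroup("e1", 1, ACTIVE),
        2: TargetPortGroup("kio1", 2, STANDBY),
    }


def fail_over(view, new_active_id):
    """Return a new table where only new_active_id is Active/Optimized."""
    if new_active_id not in view:
        raise KeyError(f"unknown target port group id {new_active_id}")
    return {
        gid: TargetPortGroup(
            group.name,
            group.group_id,
            ACTIVE if gid == new_active_id else STANDBY,
        )
        for gid, group in view.items()
    }


def views_consistent(views_by_node):
    """True if every node reports identical group IDs and ALUA states."""
    views = list(views_by_node.values())
    return all(view == views[0] for view in views[1:])


if __name__ == "__main__":
    nodes = {"e1": initial_view(), "kio1": initial_view()}
    print("before failover, consistent:", views_consistent(nodes))

    # Switch the LUN to kio1: the same change has to be applied on both nodes.
    nodes = {node: fail_over(view, 2) for node, view in nodes.items()}
    for node, view in nodes.items():
        for group in view.values():
            print(node, group.name, group.group_id, group.alua_state)
    print("after failover, consistent:", views_consistent(nodes))
```

In this model, applying the swap on only one node leaves the two tables different, which corresponds to the mismatch ESXi is said to reject.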