On Mon, Dec 21, 2015 at 9:11 AM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Thu, 2015-12-10 at 01:55 +0000, Sam McLeod wrote:
>> >>> The Company ID, VSI, and VSIE are generated by LIO based upon the
>> >>> current vpd_unit_serial configfs attribute value.
>> >>>
>> >>> So as long as vpd_unit_serial is persistent, and the same value for
>> >>> backend devices across export failover to different nodes, Xen will
>> >>> always see the same EVPD information.
>> >>>
>> >>> Are you saying that vpd_unit_serial is already persistent across export
>> >>> failover, but Xen is still having problems..?
>> >>>
>> >>> Have you confirmed with sg_inq -i both before and after the export
>> >>> failover occurs..?
>>
>> Hi Nicholas,
>>
>> Sorry for how long it's taken me to reply but I wanted to let you (and
>> the mailing list) know this is this resolved with great thanks to your
>> explanation of how the vpd_unit_serial works in relation to the SCSI
>> ID.
>>
>> Once we enforced the vpd_unit_serial on each of the LUNs we can
>> consistently fail over between iSCSI servers without the SCSI ID
>> changing.
>>
>> For reference for those using Pacemaker + Corosync with the LIO
>> target:
>>
>>
>> primitive iscsi_lun_r1 iSCSILogicalUnit \
>> op monitor timeout=10s interval=30s on-fail=restart \
>> op start timeout=20s interval=0 on-fail=restart \
>> op stop timeout=20s interval=0 on-fail=restart \
>> params
>> target_iqn="iqn.2003-01.org.linux-iscsi.s1-san5.x8664:sn.cb568058d955"
>> scsi_sn=bff3f42a-49d8-4cfc-b64e-2b933e98141d lun=1 path="/dev/drbd1"
>> allowed_initiators="iqn.2015-05.com.example:516c8f8c
>> iqn.2015-06.com.example:2dcd27e0 iqn.2013-09.com.example:e611b8f2
>> iqn.2013-11.com.example:aef3bcea iqn.2015-06.com.example:3577646c
>> iqn.2015-05.com.example:3367ed85 iqn.2015-07.com.example:0467ccce
>> iqn.2015-11.com.example:40ee457b" implementation=lio-t
>>
>> Note the scsi_sn parameter being passed in, this is what enforces the
>> vpd_unit_serial as per
>> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/iSCSILogicalUnit#L367
>>
>> Such a simple fix to something that for a long time we thought was
>> unrelated.
>> I plan to write a quick blog post up on this as there are a lot of
>> other people having this issue with Xen and it's clearly quite easy to
>> fix when you understand the relationship as you pointed out.
>
> (Adding Florian + JXM CC')
>
> Thanks for following up on your original post.
>
> Yes, this default resource-agent behavior has caused endless amounts of
> confusion to end-users over the years.
>
> It's difficult to imagine a case where vpd_unit_serial persistence
> should not be happening during LIO backend + export fail-over between
> cluster nodes.
>
> Or at least, there should be a giant warning or something.
>
> That said, I have no idea who is maintaining the HA resource-agents
> stuff these days, but it would certainly be a good idea to add this bug
> here:
>
> https://github.com/ClusterLabs/resource-agents/issues
>
> Would you be so kind to articulate this bug on github, and what you've
> done beyond the defaults in order to have a working setup..?

Yes, please. From the thread above, and from the ML archive, I can't
really tell what exactly the problem is, or what the suggested fix
would be (granted, I may be missing some context). As far as I can
tell, the RA already does the right thing by accepting a parameter
that lets users persist scsi_sn. It also tries to generate a suitable
value for you by default, based on the Pacemaker resource ID.

Additional insight would be appreciated.

Cheers,
Florian
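
For anyone who wants to run the before/after check suggested earlier in
the thread (sg_inq -i on the initiator, once before and once after the
export failover), here is a minimal sketch in Python. It assumes
sg3_utils is installed on the initiator; the function names and the
device path are hypothetical, and the comparison is a plain text
comparison of the two outputs, not a parser for the identification
page.

```python
import subprocess


def read_device_id(device: str) -> str:
    """Return the raw output of `sg_inq -i <device>`.

    `-i` asks sg_inq for the device identification VPD page, which is
    where the values derived from vpd_unit_serial show up.
    Requires sg3_utils and sufficient privileges to open the device.
    """
    result = subprocess.run(
        ["sg_inq", "-i", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def normalize(text: str) -> list[str]:
    """Strip surrounding whitespace and drop blank lines for comparison."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def ids_match(before: str, after: str) -> bool:
    """True if two sg_inq outputs are identical apart from whitespace."""
    return normalize(before) == normalize(after)
```

Usage would be along these lines: call read_device_id("/dev/sdX")
(hypothetical path) before the failover and keep the result, trigger the
failover, call it again, and pass both strings to ids_match(). A False
result means the initiator sees different identification data after the
failover, which is the symptom discussed above.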