Re: Setting iSCSI SCSI ID with LIO?

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Mon, 21 Dec 2015 00:11:44 -0800

On Thu, 2015-12-10 at 01:55 +0000, Sam McLeod wrote:
> >>>The Company ID, VSI, and VSIE are generated by LIO based upon the
> >>> current vpd_unit_serial configfs attribute value.
> >>> 
> >>> So as long as vpd_unit_serial is persistent, and the same value for
> >>> backend devices across export failover to different nodes, Xen will
> >>> always see the same EVPD information.
> >>> 
> >>> Are you saying that vpd_unit_serial is already persistent across export
> >>> failover, but Xen is still having problems..?
> >>> 
> >>> Have you confirmed with sg_inq -i both before and after the export
> >>> failover occurs..?
> 
> Hi Nicholas,
> 
> Sorry for how long it's taken me to reply but I wanted to let you (and
> the mailing list) know that this is resolved, with great thanks to your
> explanation of how the vpd_unit_serial works in relation to the SCSI
> ID.
> 
> Once we enforced the vpd_unit_serial on each of the LUNs we can
> consistently fail over between iSCSI servers without the SCSI ID
> changing.
> 
> For reference for those using Pacemaker + Corosync with the LIO
> target:
> 
> 
> primitive iscsi_lun_r1 iSCSILogicalUnit \
>         op monitor timeout=10s interval=30s on-fail=restart \
>         op start timeout=20s interval=0 on-fail=restart \
>         op stop timeout=20s interval=0 on-fail=restart \
>         params \
>         target_iqn="iqn.2003-01.org.linux-iscsi.s1-san5.x8664:sn.cb568058d955" \
>         scsi_sn=bff3f42a-49d8-4cfc-b64e-2b933e98141d lun=1 path="/dev/drbd1" \
>         allowed_initiators="iqn.2015-05.com.example:516c8f8c iqn.2015-06.com.example:2dcd27e0 iqn.2013-09.com.example:e611b8f2 iqn.2013-11.com.example:aef3bcea iqn.2015-06.com.example:3577646c iqn.2015-05.com.example:3367ed85 iqn.2015-07.com.example:0467ccce iqn.2015-11.com.example:40ee457b" \
>         implementation=lio-t
>
> Note the scsi_sn parameter being passed in; this is what enforces the
> vpd_unit_serial as per
> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/iSCSILogicalUnit#L367
> 
> Such a simple fix to something that for a long time we thought was
> unrelated.
> I plan to write a quick blog post up on this as there are a lot of
> other people having this issue with Xen and it's clearly quite easy to
> fix when you understand the relationship as you pointed out.
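
For anyone reading this in the archives, the relationship between
vpd_unit_serial and the SCSI ID can be sketched roughly as below. This is
a minimal illustration, not the kernel code: it assumes that LIO builds
the NAA identifier from the type nibble 6, the IEEE company ID 0x001405,
and then the first 25 hex digits found in the unit serial (non-hex
characters such as dashes skipped, short serials zero-padded). The
function name and constants are made up for the example.

```python
import string

# Assumption: NAA type 6 followed by the IEEE company ID 0x001405.
NAA_PREFIX = "6001405"
# Assumption: 25 hex digits (36-bit VSI + 64-bit VSIE) come from the serial.
VENDOR_NIBBLES = 25


def naa_from_unit_serial(unit_serial: str) -> str:
    """Return a 32-hex-digit NAA-style identifier derived from a unit serial."""
    # Keep only hex digits; dashes and other characters are skipped.
    digits = [c.lower() for c in unit_serial if c in string.hexdigits]
    # Take the leading digits and zero-pad if the serial is too short.
    vendor = "".join(digits[:VENDOR_NIBBLES]).ljust(VENDOR_NIBBLES, "0")
    return NAA_PREFIX + vendor


if __name__ == "__main__":
    serial = "bff3f42a-49d8-4cfc-b64e-2b933e98141d"  # scsi_sn from the config above
    node_a = naa_from_unit_serial(serial)
    node_b = naa_from_unit_serial(serial)
    # Same serial on both nodes gives the same identifier after failover.
    print(node_a, node_a == node_b)
```

The point is only that the identifier is a pure function of the serial:
if both cluster nodes export the LUN with the same vpd_unit_serial, the
initiator sees the same ID, and if the serial is regenerated on
failover, the ID changes with it.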

(Adding Florian + JXM to CC)

Thanks for following up on your original post.

Yes, this default resource-agent behavior has caused endless amounts of
confusion to end-users over the years.

It's difficult to imagine a case where vpd_unit_serial persistence
should not be happening during LIO backend + export fail-over between
cluster nodes.

Or at least, there should be a giant warning or something.

That said, I have no idea who is maintaining the HA resource-agents
stuff these days, but it would certainly be a good idea to report this
bug here:

https://github.com/ClusterLabs/resource-agents/issues

Would you be so kind as to describe this bug on GitHub, along with what you've
done beyond the defaults in order to have a working setup..?

Thank you,

--nab
