Re: Setting iSCSI SCSI ID with LIO?

Florian Haas <florian@xxxxxxxxxxx> · Mon, 21 Dec 2015 11:52:39 +0100

On Mon, Dec 21, 2015 at 9:11 AM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> On Thu, 2015-12-10 at 01:55 +0000, Sam McLeod wrote:
>> >>>The Company ID, VSI, and VSIE are generated by LIO based upon the
>> >>> current vpd_unit_serial configfs attribute value.
>> >>>
>> >>> So as long as vpd_unit_serial is persistent, and the same value for
>> >>> backend devices across export failover to different nodes, Xen will
>> >>> always see the same EVPD information.
>> >>>
>> >>> Are you saying that vpd_unit_serial is already persistent across export
>> >>> failover, but Xen is still having problems..?
>> >>>
>> >>> Have you confirmed with sg_inq -i both before and after the export
>> >>> failover occurs..?
>>
>> Hi Nicholas,
>>
>> Sorry for how long it's taken me to reply, but I wanted to let you (and
>> the mailing list) know that this is now resolved, with great thanks to
>> your explanation of how vpd_unit_serial works in relation to the SCSI
>> ID.
>>
>> Once we enforced the vpd_unit_serial on each of the LUNs we can
>> consistently fail over between iSCSI servers without the SCSI ID
>> changing.
>>
>> For reference for those using Pacemaker + Corosync with the LIO
>> target:
>>
>>
>> primitive iscsi_lun_r1 iSCSILogicalUnit \
>>         op monitor timeout=10s interval=30s on-fail=restart \
>>         op start timeout=20s interval=0 on-fail=restart \
>>         op stop timeout=20s interval=0 on-fail=restart \
>>         params target_iqn="iqn.2003-01.org.linux-iscsi.s1-san5.x8664:sn.cb568058d955" \
>>                scsi_sn=bff3f42a-49d8-4cfc-b64e-2b933e98141d lun=1 path="/dev/drbd1" \
>>                allowed_initiators="iqn.2015-05.com.example:516c8f8c iqn.2015-06.com.example:2dcd27e0 iqn.2013-09.com.example:e611b8f2 iqn.2013-11.com.example:aef3bcea iqn.2015-06.com.example:3577646c iqn.2015-05.com.example:3367ed85 iqn.2015-07.com.example:0467ccce iqn.2015-11.com.example:40ee457b" \
>>                implementation=lio-t
>>
>> Note the scsi_sn parameter being passed in; this is what enforces the
>> vpd_unit_serial, as per
>> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/iSCSILogicalUnit#L367
>>
>> Such a simple fix to something that for a long time we thought was
>> unrelated.
>> I plan to write a quick blog post up on this as there are a lot of
>> other people having this issue with Xen and it's clearly quite easy to
>> fix when you understand the relationship as you pointed out.
>
> (Adding Florian + JXM CC')
>
> Thanks for following up on your original post.
>
> Yes, this default resource-agent behavior has caused endless amounts of
> confusion to end-users over the years.
>
> It's difficult to imagine a case where vpd_unit_serial persistence
> should not be happening during LIO backend + export fail-over between
> cluster nodes.
>
> Or at least, there should be a giant warning or something.
>
> That said, I have no idea who is maintaining the HA resource-agents
> stuff these days, but it would certainly be a good idea to add this bug
> here:
>
> https://github.com/ClusterLabs/resource-agents/issues
>
> Would you be so kind as to articulate this bug on github, and what you've
> done beyond the defaults in order to have a working setup..?

Yes, please. From the thread above, and from the ML archive, I can't
really tell exactly what the problem is, or what the suggested fix
would be (granted, I may be missing some context). As far as I can
tell, the RA already does the right thing by accepting a parameter
that lets users persist scsi_sn. It also tries to generate a
suitable value for you by default, based on the Pacemaker resource ID.
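
To make the idea of a default derived from the resource ID concrete,
here is a rough shell sketch. It is an illustration under assumptions,
not a copy of the RA's code: the function name derive_default_sn, the
use of md5sum, and the truncation to 8 characters are all chosen for
the example. The only point is that the same resource ID yields the
same value on every node, so the serial would not change on failover.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the resource agent): derive a
# deterministic serial from a Pacemaker resource ID by hashing it.
derive_default_sn() {
    resource_id="$1"
    # md5sum prints "<hash>  -"; keep only the first 8 hex characters.
    printf '%s' "$resource_id" | md5sum | cut -c1-8
}

# The same ID maps to the same serial no matter which node computes it.
derive_default_sn "iscsi_lun_r1"
```

Passing scsi_sn explicitly, as in the configuration quoted above, pins
the value regardless of how any default is computed.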

Additional insight would be appreciated.

Cheers,
Florian