Re: WWID changes after Target reboot - Linux HA

On 06/01/15 21:38, Nicholas A. Bellinger wrote:
> Hi Daniel,
>
> On Mon, 2014-12-29 at 16:15 +0000, Daniel Piddock wrote:
>> Hi,
>>
>> I was hoping somebody would be able to help with an issue that has me
>> stumped. Sorry if this has been asked before but I haven't found a
>> similar issue.
>>
>> I have Corosync+Pacemaker serving up an iSCSI target, using
>> ocf::heartbeat:iSCSITarget and ocf::heartbeat:iSCSILogicalUnit. The
>> backstore is an iblock of an LVM LV stored in DRBD on a local drive
>> partition.
>>
>> /dev/sda3 - DRBD physical partition
>> /dev/drbd0 - LVM PV
>> /dev/VG0/iscsi1 - LVM LV iblock for iSCSI
>>
>> Every time that the target is removed and restored the WWID as seen by
>> the initiators changes (fetched with "/lib/udev/scsi_id -g -u -d
>> /dev/sdc"). I have multiple network routes so this changing of WWID
>> seriously upsets multipath.
>>
>> For example, WWID is currently 36001405caa429dc7e474943bc6026541.
>> Previously it was 360014057b86b6a48432485fb4201da9d
>>
>> Any pointers on how to make the WWID static?
>>
>> Target servers and initiators are running Debian stable with backport
>> kernels.
>> linux-image-3.16.0-0.bpo.4-amd64 3.16.7-ckt2-1~bpo70+1
>> targetcli 2.0rc1-2
>> lio-utils 3.1+git2.fd0b34fd-2
>> python-rtslib 2.1-2
>>
>> Frustratingly I have access to an almost identical setup where the WWID
>> is static. The only difference I can spot is that DRBD is on a separate
>> drive without a partition table.
>>
> So using the legacy lio-utils CLI, an existing device unit serial is set
> after tcm_node --establishdev $HBA/$DEV using:
>
>    tcm_node --setunitserialwithmd $HBA/$DEV $UNIT_SERIAL
>
> It sounds like tcm_node --block is being called incorrectly each time
> during fail-over by the ocf script to configure the backend device,
> instead of just the first time to generate the initial $UNIT_SERIAL.
>
> It would be useful to verify exactly which tcm_node CLI operations are
> occurring on both setups in order to diagnose why one setup is doing the
> right thing to reset an existing $UNIT_SERIAL using
> --setunitserialwithmd, and the other is incorrectly using --block to
> generate a new $UNIT_SERIAL each time.
>
> --nab

Hi nab,

Thank you for your response. I was fiddling about on my "broken" test
box and finally noticed the cause of the issue in syslog:
lrmd: [2601]: info: RA output: (iscsi.lun.test:start:stderr)
/usr/lib/ocf/resource.d//heartbeat/iSCSILogicalUnit: line 56: openssl:
command not found

I must have missed it in all the noise (corosync/pacemaker is so
chatty). I never would have spotted it if I'd reduced the logging level,
though.

Installing openssl results in a predictable WWID being generated.
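
For anyone hitting the same symptom, here is a minimal sketch of why a
missing openssl can matter. It assumes the resource agent derives its
default serial from an MD5 hash of a fixed string such as the resource
name; I have not quoted the agent's actual code, and the function names
below (derive_serial, random_serial, openssl_available) are hypothetical,
for illustration only. A hash of a fixed input yields the same serial on
every start, whereas a serial generated afresh changes each time.

```python
import hashlib
import shutil
import uuid


def derive_serial(resource_name: str, length: int = 8) -> str:
    """Deterministic serial: first `length` hex digits of MD5(resource_name).

    Assumption: this mirrors the general idea of hashing a fixed string,
    not the exact command the resource agent runs.
    """
    return hashlib.md5(resource_name.encode("utf-8")).hexdigest()[:length]


def random_serial() -> str:
    """Stand-in for a serial that is generated anew on every start."""
    return str(uuid.uuid4())


def openssl_available() -> bool:
    """Return True if an `openssl` binary can be found on PATH."""
    return shutil.which("openssl") is not None


if __name__ == "__main__":
    name = "iscsi.lun.test"  # resource name as it appears in the syslog line
    print("openssl on PATH:", openssl_available())
    print("derived serial, start 1:", derive_serial(name))
    print("derived serial, start 2:", derive_serial(name))
    print("random serial, start 1:", random_serial())
    print("random serial, start 2:", random_serial())
```

The two derived serials match across runs while the random ones do not,
which is the difference between a stable WWID and one that changes on
every restart. Checking for the binary up front (as openssl_available
does) would have surfaced the problem without digging through syslog.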

Cheers,
Dan


