Erez Zilber wrote:
Hi,
I'm trying to run open-iscsi with the multipath tool. I was able to
create a scenario in which the initiator machine hangs:
My initiator runs on a SLES 10 beta 8 machine. The node db contains 2 records, both for the same target, one per portal:
iscsiadm -m node
[83347a] 192.168.10.106:3260,1 iqn.2005-12.com.voltaire.206000C0FF07C1D1
[030479] 192.168.10.105:3260,1 iqn.2005-12.com.voltaire.206000C0FF07C1D1
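Both records above carry the same IQN and differ only in the portal address. A minimal sketch for counting records per target, assuming the `iscsiadm -m node` output format shown here (record id, portal, IQN as whitespace-separated fields). `records_per_target` is a hypothetical helper name, not part of open-iscsi:

```shell
# Hypothetical helper: count node records per target IQN.
# Assumes each input line looks like: [recid] ip:port,tpgt iqn
records_per_target() {
    awk 'NF >= 3 { count[$3]++ }
         END { for (iqn in count) print iqn, count[iqn] }'
}

# Example with the two records from this report:
printf '%s\n' \
    '[83347a] 192.168.10.106:3260,1 iqn.2005-12.com.voltaire.206000C0FF07C1D1' \
    '[030479] 192.168.10.105:3260,1 iqn.2005-12.com.voltaire.206000C0FF07C1D1' |
    records_per_target
```

On a live system the same helper could be fed directly: `iscsiadm -m node | records_per_target`.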
Loading the dm modules:
salt:~ # modprobe dm-mod; modprobe dm-multipath
I start the initiator:
salt:~ # /etc/init.d/open-iscsi start
Starting iSCSI initiator service: done
Logging into iqn.2005-12.com.voltaire.206000C0FF07C1D1: done
Logging into iqn.2005-12.com.voltaire.206000C0FF07C1D1: done
salt:~ # sg_map -i -x
/dev/sg0 0 0 0 0 0 /dev/sda DotHill SANnet II FC 411I
/dev/sg1 1 0 0 0 0 /dev/sdb DotHill SANnet II FC 411I
Now, I start the multipath daemon:
salt:~ # ls /dev/mapper/
control
salt:~ # multipathd
salt:~ # ls /dev/mapper/
control mpath0
salt:~ # multipath -l
mpath0 (3600c0ff00000000007c1d121c397f503)
[size=10 GB][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 0:0:0:0 sda 8:0 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:0 sdb 8:16 [active][undef]
Running some traffic:
salt:~ # dd if=/dev/mapper/mpath0 of=/dev/null count=500000
500000+0 records in
500000+0 records out
256000000 bytes (256 MB) copied, 9.54608 seconds, 26.8 MB/s
Now I disconnect the target that is represented by /dev/sda (by unplugging
its cable) and run another dd command:
salt:~ # dd if=/dev/mapper/mpath0 of=/dev/null count=500000
500000+0 records in
500000+0 records out
256000000 bytes (256 MB) copied, 199.601 seconds, 1.3 MB/s
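Both dd runs read the same 256000000 bytes (500000 records of 512 bytes, dd's default block size), so the two timings can be compared directly. A small sketch of that arithmetic, using only the numbers printed above. `throughput_mbs` and `slowdown` are hypothetical helper names:

```shell
# Hypothetical helper: throughput in MB/s (1 MB = 10^6 bytes, as dd reports it).
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Hypothetical helper: how many times longer the second run took than the first.
slowdown() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

throughput_mbs 256000000 9.54608    # both paths connected
throughput_mbs 256000000 199.601    # after unplugging the path behind /dev/sda
slowdown 9.54608 199.601
```

The first two results should agree with the rates dd printed. The third shows that the run spanning the failover took roughly 21 times longer.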
Now the path in the 1st priority group has failed:
salt:~ # multipath -l
mpath0 (3600c0ff00000000007c1d121c397f503)
[size=10 GB][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:0:0 sda 8:0 [failed][faulty]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:0 sdb 8:16 [active][undef]
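For anyone scripting this reproduction, the failed path can be picked out of the listing above mechanically. A minimal sketch, assuming the `multipath -l` layout shown in this report (path lines of the form `\_ H:C:T:L dev major:minor [state][state]`); other multipath-tools versions may print a different layout. `failed_paths` is a hypothetical helper name:

```shell
# Hypothetical helper: print the device name of every path reported as failed.
# Assumes path lines look like: \_ H:C:T:L dev major:minor [dm state][checker state]
failed_paths() {
    # A path line has a 4-part H:C:T:L in field 2; group and header lines do not.
    awk 'split($2, hctl, ":") == 4 && index($5, "[failed]") == 1 { print $3 }'
}

# Usage on a live system (not run here): multipath -l | failed_paths
```

Fed the listing above, it should print only `sda`.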
I reconnect the target and its path is active again:
salt:~ # multipath -l
mpath0 (3600c0ff00000000007c1d121c397f503)
[size=10 GB][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:0:0 sda 8:0 [active][undef]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:0 sdb 8:16 [active][undef]
I stop the initiator and the machine hangs:
salt:~ # /etc/init.d/open-iscsi stop
Logging out from iqn.2005-12.com.voltaire.206000C0FF07C1D1: done
Logging out from iqn.2005-12.com.voltaire.206000C0FF07C1D1: (now the
machine is dead)
I know that the device mapper was tested with open-iscsi in the past. Was
this scenario tested? Is anyone else able to reproduce this behavior? I
reproduced it 5 times in a row.
Thanks
BTW - a very similar test without the device mapper works ok. I
disconnected /dev/sda, started a dd command on /dev/sda, then reconnected
/dev/sda, and the command completed. When I stopped the initiator, the
machine didn't hang.
Erez
--
Erez Zilber | 972-9-971-7689
Software Engineer, Storage Team
Voltaire – The Grid Backbone
www.voltaire.com <http://www.voltaire.com/>