Yesterday I had a problem where a server with two QLogic HBAs detected only 2 of the 4 paths it should see.
Each HBA is connected to a different FC switch.
Each FC switch is connected to both controllers of an IBM DS6800 storage array,
so normally each disk is seen through 4 paths.
The system is RHEL 5.5 x86_64.
Normally "multipath -l" shows something like this for each mpath device:
mpath1 (3600507630efe0b0c0000000000000601) dm-8 IBM,1750500
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 1:0:3:2 sdao 66:128 [active][undef]
 \_ 2:0:3:2 sdaq 66:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:2:2 sdd  8:48   [active][undef]
 \_ 2:0:2:2 sdp  8:240  [active][undef]
Yesterday one server was able to see only 2 paths for each mpath device.
I had something like this:
mpath1 (3600507630efe0b0c0000000000000601) dm-8 IBM,1750500
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 1:0:3:2 sdao 66:128 [active][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:2:2 sdd  8:48   [active][undef]
I had some trouble working out the PCI ID <--> WWPN <--> disk device associations, so that I could give the SAN guys the correct WWPN for analysis and resolution.
I took the steps below. Could anyone confirm they are OK, or suggest other analysis information to collect?
a) From above output of "multipath -l", and also from output of "lsscsi" command
..
[1:0:2:2] disk IBM 1750500 .508 /dev/sdd
[1:0:2:3] disk IBM 1750500 .508 /dev/sde
..
[1:0:3:0] disk IBM 1750500 .508 /dev/sdal
[1:0:3:1] disk IBM 1750500 .508 /dev/sdan
[1:0:3:2] disk IBM 1750500 .508 /dev/sdao
with only lines of the form "1:0:x:x", I concluded that one adapter was correctly seeing its 2 paths and the other one was seeing nothing.
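The check in a) can be scripted. What follows is a minimal sketch, assuming lsscsi output in the format shown above; count_luns_per_host is a hypothetical helper name, and the counts also include any local (non-SAN) disks attached to other SCSI hosts.

```shell
# Count devices per SCSI host number from lsscsi-style lines on stdin.
# The host number is the first field of the [H:C:T:L] address.
count_luns_per_host() {
    sed -n 's/^\[\([0-9][0-9]*\):.*/host\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

It would be run as "lsscsi | count_luns_per_host". An FC host that is missing from the output, or that has far fewer devices than its peer, is the one to look at.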
b) In /var/log/messages I had this, for one adapter only:
Jul 27 17:54:58 orastud1 kernel: qla2xxx 0000:06:00.0: SNS scan failed -- assuming zero-entry result...
That more or less confirmed the conclusion from a).
Now I had to provide the WWPN of the HBA with PCI ID 0000:06:00.0.
I found that this was not so obvious (at least to me).
c) To see my two WWPNs:
# for i in 1 2 ; do echo "host$i $(cat /sys/class/fc_host/host$i/port_name)"; done
host1 0x21000024ff288e04
host2 0x21000024ff288e05
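The loop above hard-codes the host numbers 1 and 2. A slightly more general sketch, assuming the standard fc_host sysfs attributes port_name and port_state are present on this kernel (list_fc_wwpns is a hypothetical name, and the optional argument is only there so the function can be pointed at a copy of the sysfs tree):

```shell
# Print "hostN WWPN state" for every FC host found under sysfs.
# Runs in a subshell so its variables do not leak into the caller.
list_fc_wwpns() (
    sysfs_root="${1:-/sys}"
    for h in "$sysfs_root"/class/fc_host/host*; do
        [ -r "$h/port_name" ] || continue
        # port_state is assumed to exist; fall back to "unknown" if not
        state=$(cat "$h/port_state" 2>/dev/null || echo unknown)
        echo "$(basename "$h") $(cat "$h/port_name") $state"
    done
)
```

Printing the port state next to the WWPN may help show at a glance whether the link of the suspect adapter is up at all.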
OK, but how do I directly associate hostX with PCI 0000:06:00.0?
lspci gives:
06:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
06:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
but no verbosity switch (e.g. "-vv") makes it show their WWPNs.
Can I safely say that host1 <--> 06:00.0 and host2 <--> 06:00.1?
What if the bus numbers are not the same (for example first HBA at 05:00.0 and second at 06:00.0)?
d) Basically I went through something like:
# ls -d /sys/class/fc_transport/target*/device/*/block* | grep sdao
/sys/class/fc_transport/target1:0:3/device/1:0:3:2/block:sdao
# ls -d /sys/class/fc_transport/target*/device/*/block* | grep sdd
/sys/class/fc_transport/target1:0:2/device/1:0:2:2/block:sdd
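In a SCSI address H:C:T:L the first number is the SCSI host number, so target1:0:3 belongs to host1. The sketch below lists each FC target with its host and the WWPN of the remote (storage-side) port, which may also be useful to the SAN team. It assumes the fc_transport port_name attribute is available; list_fc_targets is a hypothetical name.

```shell
# Print "hostN targetH:C:T remote-WWPN" for every FC target under sysfs.
list_fc_targets() (
    sysfs_root="${1:-/sys}"
    for t in "$sysfs_root"/class/fc_transport/target*; do
        [ -r "$t/port_name" ] || continue
        name=$(basename "$t")        # e.g. target1:0:3
        hostnum=${name#target}       # strip prefix  -> 1:0:3
        hostnum=${hostnum%%:*}       # keep host part -> 1
        echo "host$hostnum $name $(cat "$t/port_name")"
    done
)
```

If only host1 entries appear, that would be consistent with host2 seeing no storage ports at all.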
Can I say that host1 <--> target1?
And so, in my case, was the adapter not seeing the LUNs host2 ----> WWPN 0x21000024ff288e05?
Could I conclude directly from point a) that the host involved is host2, since "multipath -l" only shows 1:x:y:z devices?
And so that the problematic WWPN was 0x21000024ff288e05?
Anyway, how do I associate hostX with a PCI ID?
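One possible approach, not verified on this system: resolve the sysfs link of each FC host and take the last PCI address in the resulting path. The sketch assumes that /sys/class/fc_host/hostN (or its "device" link) resolves to a path under /sys/devices that contains the PCI function, and that "readlink -f" and "grep -o" are available; fc_host_to_pci is a hypothetical name.

```shell
# Print "hostN PCI-address" for every FC host under sysfs.
fc_host_to_pci() (
    sysfs_root="${1:-/sys}"
    for h in "$sysfs_root"/class/fc_host/host*; do
        [ -e "$h" ] || continue
        # Prefer the "device" link when present, else resolve the entry itself
        if [ -e "$h/device" ]; then
            path=$(readlink -f "$h/device")
        else
            path=$(readlink -f "$h")
        fi
        # The last domain:bus:slot.function in the path is the HBA function
        pci=$(printf '%s\n' "$path" \
            | grep -o '[0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]' \
            | tail -n 1)
        echo "$(basename "$h") ${pci:-unknown}"
    done
)
```

Combined with the port_name values from c), this would give the PCI ID <--> hostX <--> WWPN chain without relying on the host numbering following the PCI order.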
Thanks in advance,
Gianluca
-- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel