GoodMornings

In the meantime we did an upgrade of RHEL to 5.5, and multipath now looks
more accurate, showing only one path per HBA. We have a two-datacenter
setup with four fabrics between them, two fabrics for each datacenter.

mpath-dc2-a (360060e8004f240000000f24000000502) dm-12 HITACHI,OPEN-V -SU
[size=26G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:1:0 sdg 8:96  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 5:0:1:0 sdo 8:224 [active][ready]

I'll repeat the tests and look at the state you mention. I'm using
group_by_node_name because before, with 8 links, it was a mess; it
spreads some load between the paths, but not across all of them. Anyway,
that explains the "strange" paths; I'll see how it goes now.

Thanks
Jose

> Hi Jose,
>
> You have a total of 8 paths per LUN: 4 are marked active through HBA
> host5 and the remaining 4 are marked enabled on HBA host3 (you're on
> 2 different fabrics, right?). This may be due to the fact that you use
> the group_by_node_name policy; I don't know whether this mode actually
> load balances across the 2 HBAs.
>
> When you pull the cable (is this the test you're doing that is
> failing?) you say it times out forever.
> Since you're on policy group_by_node_name, which groups by the
> fc_transport target node name, you should look at the state of the
> target ports bound to the HBA you disconnected (are they in state
> Blocked?). They can stay Blocked for a long time if dev_loss_tmo or
> fast_io_fail_tmo is too high; the state and both timers live under
> /sys/class/fc_remote_ports/rport-H:B-R (where H is your HBA host
> number).
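> For example, to inspect a remote port's state and timers (the rport
> instance below is illustrative; list /sys/class/fc_remote_ports/ to
> find the ones bound to your HBA):
>
>     # port_state should read Online; Blocked means the port is gone
>     # but dev_loss_tmo has not yet expired
>     cat /sys/class/fc_remote_ports/rport-3:0-1/port_state
>     cat /sys/class/fc_remote_ports/rport-3:0-1/dev_loss_tmo
>     cat /sys/class/fc_remote_ports/rport-3:0-1/fast_io_fail_tmo
>
>     # e.g. fail outstanding I/O after 5s instead of waiting out
>     # the full dev_loss_tmo
>     echo 5 > /sys/class/fc_remote_ports/rport-3:0-1/fast_io_fail_tmo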
>
> I have almost the same setup with almost the same storage (OPEN-V)
> from a pair of HP XP arrays (OEM'ed Hitachi), and things are set up to
> use at most 4 paths per LUN (2 per fabric); some storage experts tend
> to say even that is too much. As the multipath policy I use multibus
> to distribute load across the 2 fabrics.
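> For illustration, a multibus variant of the HITACHI/OPEN-V device
> stanza quoted further down in this thread might look like this (a
> sketch, not a verified config; only path_grouping_policy changes):
>
>     device {
>         vendor "HITACHI"
>         product "OPEN-V"
>         path_grouping_policy multibus
>         failback immediate
>         no_path_retry fail
>     }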
>
> Hope all this will help.
>
> You say this happens when you pull the fiber cable from the server.
>
> On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
>> Hi
>>
>> > Can you show us a pvdisplay or verbose vgdisplay ?
>>
>> Here goes the vgdisplay -v of one of the VGs with mirrors:
>>
>> ###########################################################
>>
>> --- Volume group ---
>> VG Name               vg_ora_jura
>> System ID
>> Format                lvm2
>> Metadata Areas        3
>> Metadata Sequence No  705
>> VG Access             read/write
>> VG Status             resizable
>> MAX LV                0
>> Cur LV                4
>> Open LV               4
>> Max PV                0
>> Cur PV                3
>> Act PV                3
>> VG Size               52.79 GB
>> PE Size               4.00 MB
>> Total PE              13515
>> Alloc PE / Size       12292 / 48.02 GB
>> Free  PE / Size       1223 / 4.78 GB
>> VG UUID               nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_arch
>> VG Name               vg_ora_jura
>> LV UUID               8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:28
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_export
>> VG Name               vg_ora_jura
>> LV UUID               NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:32
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_data
>> VG Name               vg_ora_jura
>> LV UUID               VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               12.00 GB
>> Current LE            3072
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:40
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_redo
>> VG Name               vg_ora_jura
>> LV UUID               KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               2.00 GB
>> Current LE            512
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:48
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
>> VG Name               vg_ora_jura
>> LV UUID               lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:26
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
>> VG Name               vg_ora_jura
>> LV UUID               snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:27
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_arch_mlog
>> VG Name               vg_ora_jura
>> LV UUID               ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               4.00 MB
>> Current LE            1
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:25
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_data_mlog
>> VG Name               vg_ora_jura
>> LV UUID               TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               4.00 MB
>> Current LE            1
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:37
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
>> VG Name               vg_ora_jura
>> LV UUID               8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               12.00 GB
>> Current LE            3072
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:38
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
>> VG Name               vg_ora_jura
>> LV UUID               fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               12.00 GB
>> Current LE            3072
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:39
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_export_mlog
>> VG Name               vg_ora_jura
>> LV UUID               29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               4.00 MB
>> Current LE            1
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:29
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
>> VG Name               vg_ora_jura
>> LV UUID               1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:30
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
>> VG Name               vg_ora_jura
>> LV UUID               cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               5.00 GB
>> Current LE            1280
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:31
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_redo_mlog
>> VG Name               vg_ora_jura
>> LV UUID               811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               4.00 MB
>> Current LE            1
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:45
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
>> VG Name               vg_ora_jura
>> LV UUID               aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               2.00 GB
>> Current LE            512
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:46
>>
>> --- Logical volume ---
>> LV Name               /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
>> VG Name               vg_ora_jura
>> LV UUID               gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
>> LV Write Access       read/write
>> LV Status             available
>> # open                1
>> LV Size               2.00 GB
>> Current LE            512
>> Segments              1
>> Allocation            inherit
>> Read ahead sectors    auto
>> - currently set to    256
>> Block device          253:47
>>
>> --- Physical volumes ---
>> PV Name               /dev/mapper/mpath-dc1-b
>> PV UUID               hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
>> PV Status             allocatable
>> Total PE / Free PE    6749 / 605
>>
>> PV Name               /dev/mapper/mpath-dc2-b
>> PV UUID               hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
>> PV Status             allocatable
>> Total PE / Free PE    6749 / 605
>>
>> PV Name               /dev/mapper/mpath-dc2-mlog1p1
>> PV UUID               4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
>> PV Status             allocatable
>> Total PE / Free PE    17 / 13
>>
>> > On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
>> >> hellos
>> >>
>> >> I spent more time on this, and it seems that since LVM can't
>> >> write to any PV on the volumes it has lost, it cannot record the
>> >> failure of the devices and update the metadata on the other PVs.
>> >> So it hangs forever.
>> >>
>> >> Is this right?
>> >>
>> >>> GoodMornings
>> >>>
>> >>> This is what I have on multipath.conf:
>> >>>
>> >>> blacklist {
>> >>>     wwid SSun_VOL0_266DCF4A
>> >>>     wwid SSun_VOL0_5875CF4A
>> >>>     devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>> >>>     devnode "^hd[a-z]"
>> >>> }
>> >>> defaults {
>> >>>     user_friendly_names yes
>> >>> }
>> >>> devices {
>> >>>     device {
>> >>>         vendor "HITACHI"
>> >>>         product "OPEN-V"
>> >>>         path_grouping_policy group_by_node_name
>> >>>         failback immediate
>> >>>         no_path_retry fail
>> >>>     }
>> >>>     device {
>> >>>         vendor "IET"
>> >>>         product "VIRTUAL-DISK"
>> >>>         path_checker tur
>> >>>         path_grouping_policy failover
>> >>>         failback immediate
>> >>>         no_path_retry fail
>> >>>     }
>> >>> }
>> >>>
>> >>> As an example, this is one LUN. It shows [features=0], so I'd
>> >>> say it should fail right away:
>> >>>
>> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V -SU
>> >>> [size=26G][features=0][hwhandler=0][rw]
>> >>> \_ round-robin 0 [prio=4][active]
>> >>>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>> >>>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>> >>>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>> >>>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
>> >>> \_ round-robin 0 [prio=4][enabled]
>> >>>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>> >>>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>> >>>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>> >>>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
>> >>>
>> >>> I think they fail, since I see these messages from LVM:
>> >>>
>> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>> >>> vg_syb_roger-lv_syb_roger_admin
>> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
>> >>> devices in vg_syb_roger-lv_syb_roger_admin
>> >>>
>> >>> But for some reason LVM can't remove them. Is there any option I
>> >>> should have in lvm.conf?
>> >>>
>> >>> BestRegards
>> >>> Jose
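>> >>> (For reference: on RHEL 5 the knobs that control how dmeventd
>> >>> repairs a mirror after a failure live in the activation section
>> >>> of lvm.conf. A sketch of the settings to check, not a verified
>> >>> fix; option names may differ between lvm2 versions:)
>> >>>
>> >>> activation {
>> >>>     # "allocate" tries to replace the failed device with free
>> >>>     # space elsewhere in the VG; "remove" just drops it from
>> >>>     # the mirror
>> >>>     mirror_log_fault_policy = "allocate"
>> >>>     mirror_device_fault_policy = "remove"
>> >>> }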
>> >>>> Post your multipath.conf file; you may be queuing forever?
>> >>>>
>> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>> >>>>> Hi2all
>> >>>>>
>> >>>>> I'm on RHEL 5.4 with
>> >>>>> lvm2-2.02.46-8.el5_4.1
>> >>>>> 2.6.18-164.2.1.el5
>> >>>>>
>> >>>>> I have a multipathed SAN connection on which I'm building LVs.
>> >>>>> It's a cluster system, and I want the LVs to switch nodes on
>> >>>>> failure.
>> >>>>>
>> >>>>> If I simulate a failure through the OS via
>> >>>>> /sys/bus/scsi/devices/$DEVICE/delete
>> >>>>> I get an LV failure and the service switches to the other node.
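>> >>>>> (That is, deleting each SCSI path device from the host; the
>> >>>>> address below is one of the paths from the multipath output
>> >>>>> above, shown only as an example:)
>> >>>>>
>> >>>>> echo 1 > /sys/bus/scsi/devices/3:0:1:0/delete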
>> >>>>> But if I do a "real" port-down on the SAN switch, multipath
>> >>>>> reports the paths down, but LVM commands hang forever and
>> >>>>> nothing gets switched.
>> >>>>>
>> >>>>> From the logs I see multipath failing paths, and LVM "Failed
>> >>>>> to remove faulty devices".
>> >>>>>
>> >>>>> Any ideas how I should "fix" it?
>> >>>>>
>> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53,
>> >>>>> has failed.
>> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>> >>>>> vg_ora_scapa-lv_ora_scapa_redo
>> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is
>> >>>>> handling an event. Waiting...
>> >>>>>
>> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining
>> >>>>> active paths: 0
>> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining
>> >>>>> active paths: 0
>> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining
>> >>>>> active paths: 0
>> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining
>> >>>>> active paths: 0
>> >>>>>
>> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>> >>>>> vg_syb_roger-lv_syb_roger_admin
>> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
>> >>>>> devices in vg_syb_roger-lv_syb_roger_admin
>> >>>>>
>> >>>>> Much Thanks
>> >>>>> Jose
>> >
>> > --
>> > Sent from my mobile device
>> >
>> > Regards,
>> > Eugene Vilensky
>> > evilensky@gmail.com

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/