* James Fillman
I'm running RHEL5 with QLogic HBAs and a Sun 6140 SAN. The host type I'm using for my servers is 'Solaris (with DMP)'. This turns AVT mode on. For some reason, the two controllers are returning the same priority value to my priority callout program (mpath_prio_tpc).
Try path_grouping_policy group_by_serial?
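A sketch of what that could look like in multipath.conf — the vendor/product strings here are assumptions for a 6140 and should be checked against your `multipath -ll` output before use:

```
devices {
        device {
                vendor                  "SUN"
                product                 "CSM200_R"
                path_grouping_policy    group_by_serial
                path_checker            tur
                failback                immediate
        }
}
```

Grouping by controller serial at least keeps paths split per controller even when the prio callout returns identical values for both.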
Can anyone briefly explain what the mpath_prio_tpc utility does and where these priority values come from?
It calls out to the supplied SCSI device and returns a different integer depending on whether the controller is the preferred owner of the volume and whether it is currently the active one. The values have changed quite a bit from version to version, but if I recall correctly, in your version they are:

  path to the preferred and active controller        = 6
  path to the preferred and inactive controller      = 4
  path to the least preferred and active controller  = 3
  path to the least preferred and inactive controller = 1

Since you have group_by_prio and the paths end up in the same path group, it seems the prio callout doesn't work (you can test this yourself by running «mpath_prio_tpc /dev/sdx»). When I used AVT I set the hosts to host type AIX_FO, which seemed to work fine, for me at least. Try that?

You're also using path_checker readsector0 with AVT, which is really bad. Every time multipathd checks a path to the passive controller the volume will move there, which will interrupt I/O for several seconds. Use the tur checker instead.

However, are you _really_ sure you want AVT mode? Support for RDAC mode was recently added to dm-multipath (both a hardware handler and a path checker), and using it is normally vastly superior to AVT mode. The Linux kernel itself (partition scan on boot) as well as a lot of applications (LVM, mdadm, fdisk, etc.) believe reading from block devices is a harmless thing to do. With AVT mode, however, this I/O will cause a volume to transfer, I/O to be interrupted, and paths to fail. With AVT you won't be able to boot node A in a cluster without interrupting the I/O flow of the rest of the nodes, nor manage LVM on any node in a cluster without also interrupting I/O (unless you've configured LVM to stay away from your multipathed devices).
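For the RDAC route, a device stanza along these lines should get you started. This is a sketch only: the vendor/product strings are assumptions for a 6140, and the exact callout path and name (mpath_prio_rdac is shown here) vary between multipath-tools versions, so verify against your installed binaries:

```
devices {
        device {
                vendor                  "SUN"
                product                 "CSM200_R"
                hardware_handler        "1 rdac"
                path_checker            rdac
                prio_callout            "/sbin/mpath_prio_rdac /dev/%n"
                path_grouping_policy    group_by_prio
                failback                immediate
        }
}
```

With the rdac hardware handler and checker in place, volume ownership transfers are done explicitly by dm-multipath instead of being triggered implicitly by stray reads, which avoids the AVT problems described above.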
PS: If you have I/O failures happening on your hosts when you make changes to the storage domain setup, this is due to a mis-feature in the 6140 that makes it dim its fibre ports whenever a change is made. They said this was done to make all hosts relogin and automatically discover new volumes with no manual rescan needed, but the fabric relogin interrupts I/O and causes path failures. There is hope, though: I was told yesterday that in the next firmware release we can opt to toggle this «feature» off. Go pester Sun to get it. ;-)

Regards
--
Tore Anderson

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel