Hi,
I finally had a chance to experiment with the test rpms for cman[1] that
should solve the problem with multiple masters I had.
For these tests I was using the following rpms on RHEL4U4:
kernel-smp-2.6.9-42.0.3.EL
cman-kernel-smp-2.6.9-45.8.1TEST
cman-1.0.11-0.4.1qdisk
rgmanager-1.9.54-1
To test this I have two servers connected to one switch that has a single
uplink and nothing else attached. As heuristics for qdiskd I'm pinging a
few IP addresses outside of this switch. When I unplug the uplink with the
old cman installed, qdiskd on both servers immediately notices this and
lowers the score accordingly.
With the new version of qdiskd it seems the heuristics are no longer
evaluated once a sufficient score has been reached. When the outside
network is lost, qdiskd on both servers still reports the same score in
the status file, and both servers still report the votes for the qdisk to
cman.
If qdiskd is started while the outside network is unreachable, the score
starts out without the points for the failing heuristics. Once the network
is restored, the score jumps to at least the minimum required for
operation and once again stays there.
Is this a bug that will be fixed in the upcoming RHEL4U5 release or
could there be something else wrong with my setup?
Here's my quorumd section from cluster.conf:
-----
<quorumd interval="1" tko="5" votes="3" log_level="9"
log_facility="local4" status_file="/tmp/qdisk_status"
device="/dev/emcpowerq1">
<heuristic program="ping 172.23.4.254 -c1 -t1" score="1" interval="2"/>
<heuristic program="ping 130.246.8.13 -c1 -t3" score="1" interval="2"/>
<heuristic program="ping 130.246.72.21 -c1 -t3" score="1" interval="2"/>
<heuristic program="ping 172.23.5.120 -c1 -t1" score="1" interval="2"/>
<heuristic program="ping 172.23.6.229 -c1 -t1" score="1" interval="2"/>
<heuristic program="ping 172.23.7.34 -c1 -t1" score="1" interval="2"/>
<heuristic program="ping 172.23.7.35 -c1 -t1" score="1" interval="2"/>
<heuristic program="ping 172.23.6.233 -c1 -t1" score="1" interval="2"/>
</quorumd>
-----
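To cross-check the score in the status file, the heuristics from the
quorumd section above can be run by hand. The following is only a minimal
sketch in Python, not how qdiskd itself is implemented. It assumes that a
heuristic counts towards the score when its program exits with status 0,
and that, with no min_score attribute set, the minimum score defaults to
floor((n+1)/2), where n is the sum of all heuristic scores. The function
names (run_heuristic, current_score, default_min_score) are made up for
this illustration.

```python
import os
import shlex
import subprocess

# Heuristic commands and scores copied from the quorumd section above.
HEURISTICS = [
    ("ping 172.23.4.254 -c1 -t1", 1),
    ("ping 130.246.8.13 -c1 -t3", 1),
    ("ping 130.246.72.21 -c1 -t3", 1),
    ("ping 172.23.5.120 -c1 -t1", 1),
    ("ping 172.23.6.229 -c1 -t1", 1),
    ("ping 172.23.7.34 -c1 -t1", 1),
    ("ping 172.23.7.35 -c1 -t1", 1),
    ("ping 172.23.6.233 -c1 -t1", 1),
]


def run_heuristic(command):
    """Run one heuristic; True if it exits with status 0 (assumed 'pass')."""
    devnull = open(os.devnull, "w")
    try:
        return subprocess.call(shlex.split(command),
                               stdout=devnull, stderr=devnull) == 0
    except OSError:
        # The program could not be started at all; treat this as a failure.
        return False
    finally:
        devnull.close()


def current_score(heuristics, runner=run_heuristic):
    """Sum the scores of all heuristics that currently pass."""
    return sum(score for command, score in heuristics if runner(command))


def default_min_score(heuristics):
    """Assumed default minimum score: floor((n + 1) / 2)."""
    total = sum(score for _, score in heuristics)
    return (total + 1) // 2
```

Calling current_score(HEURISTICS) on each node while the uplink is
unplugged and comparing the result with default_min_score(HEURISTICS) and
with the score in /tmp/qdisk_status should show whether qdiskd is still
acting on the heuristics. Under the assumptions above, the eight heuristics
of score 1 give a minimum score of 4.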
If you need any more information, I am happy to provide it.
Kind regards,
Frederik
[1] http://people.redhat.com/lhh/packages.html
--
Frederik Ferner
Linux Systems Administrator phone: +44 1235 77 8624
Diamond Light Source Ltd. mob: +44 7917 08 5110