Hi Lon,
thanks for your reply.
Unfortunately I'm not in a position to test this at the
moment. I really should get myself a proper test setup :-(
Before I try again, I have one small detail to clarify.
Lon Hohberger wrote:
> On Wed, May 02, 2007 at 02:44:06PM +0100, Frederik Ferner wrote:
>> With the new version of qdiskd it seems the heuristics are not tested
>> anymore after it reaches a sufficient score once. When the outside
>> network is lost, qdiskd on both servers still claims the same score in
>> the status file and both servers report the votes for the qdisk to cman.
> Hmm, could you add 'tko="1"' to your cluster.conf for the heuristics? I
                      ^^^^^^^
> wonder if it's an initialization problem.
>> If qdiskd is started while the outside network is unreachable the scores
>> start without the scores for the failing heuristics. Once the network is
>> restored the score jumps to at least the minimum required for operation
>> and once again stays there.
> This seems to work for me:
> [10538] debug: Heuristic: 'ping 192.168.79.254 -c1 -t3' missed (1/3)
> [10538] debug: Heuristic: 'ping 192.168.79.254 -c1 -t3' missed (2/3)
> [10538] info: Heuristic: 'ping 192.168.79.254 -c1 -t3' DOWN (3/3)
> [10537] notice: Score insufficient for master operation (0/11; required=6); downgrading
> Message from syslogd@green at Mon May 7 10:36:43 2007 ...
> green clurgmgrd[7305]: <emerg> #1: Quorum Dissolved
> (machine rebooted)
[snip]
> Hmm, try adding tko="3" to each of your ping heuristics, like this:
                  ^^^^^^^
Is this the same suggestion as above (tko="1")? In any case I'll try
that next time I get a chance.
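For reference, a minimal sketch of where a per-heuristic tko setting would go in cluster.conf. Only the ping command, the tko value, and the required score of 6 come from the log output quoted above. The label, votes, interval, and score values are assumptions for illustration and are not taken from either configuration discussed in this thread.

```xml
<!-- Sketch only: label, votes, interval and score values are assumed. -->
<quorumd interval="1" tko="10" votes="1" min_score="6" label="example_qdisk">
    <!-- tko="3": the heuristic is reported DOWN after 3 missed checks,
         matching the (1/3), (2/3), (3/3) progression in the log above. -->
    <heuristic program="ping 192.168.79.254 -c1 -t3"
               score="11" interval="2" tko="3"/>
</quorumd>
```

With a single heuristic worth 11 points, losing it would drop the score to 0/11, below the required 6. A real setup may well split that score across several heuristics.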
Many thanks,
Frederik
--
Frederik Ferner
Linux Systems Administrator phone: +44 1235 77 8624
Diamond Light Source Ltd. mob: +44 7917 08 5110
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster