Re: fencing loop in a 2-node partitioned cluster

Unfortunately, doing this seems to have a problematic side effect.
I set -f 1 on one node and -f 10 on the other.
Now if I panic one node, it is fenced by the other one, but when it
restarts it stays at
start fencing....
until, after a few minutes, it forms its own cluster and kills cman on the other node
(the same problem as in bugzilla 485026, which you know well...)
I tried twice, and as soon as I removed the -f option from both cman
init scripts, the panic scenario worked correctly again.
So I think I will stay with the default and accept that the whole
cluster may go offline if the intra-cluster network is lost completely
(it is bonded, but operations on the VLAN that compromise it
could still trigger the problem...)
But it is a pity that the cluster cannot fall back to the production network if
the heartbeat goes down (even kimberlite was able to do this..)
Or better, the quorum master should win and fence the other node, or
fencing should be service based....
There is something about this in the FAQ, but it does not seem easy to
configure...

;-(

On Tue, Feb 24, 2009 at 8:09 PM, Marc Grimme <grimme@xxxxxxx> wrote:

> This time you're lucky cause it's just a fenced option:
>
> [root@generix2 ~]# fenced -h
> Usage:
>
> fenced [options]
>
> Options:
>
>  -c           All nodes are in a clean state to start
>  -j <secs>     Post-join fencing delay (default 6)
>  -f <secs>     Post-fail fencing delay (default 0)
>  -O <path>    Override path (default /var/run/cluster/fenced_override)
>  -D           Enable debugging code and don't fork
>  -h           Print this help, then exit
>  -V           Print program version information, then exit
>
> Command line values override those in cluster.conf.
> For an unbounded delay use <secs> value of -1.
>
> And you don't want to change it for all nodes the same. So add this (-f )
> option to the /etc/init.d/cman initscript in the function start_daemons to
> fenced. As there is no variable like FENCED_FAIL_DELAY you have to change the
> script ;( .
>
> Marc.
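
For reference, a minimal sketch of the kind of init-script change described in the quoted message. FENCED_FAIL_DELAY and fenced_opts are hypothetical names (the stock script has no such variable, as noted above), and the sketch only builds the option string; it does not start fenced. Keep in mind the side effect reported at the top of this message when the two nodes use different delays.

```shell
# Hypothetical helper that could be called from start_daemons in
# /etc/init.d/cman. FENCED_FAIL_DELAY is an assumption made for this
# sketch; the stock init script does not define or read it.
fenced_opts() {
    # Emit "-f <secs>" only when a per-node delay has been configured,
    # so an unset variable leaves fenced at its default (0).
    if [ -n "${FENCED_FAIL_DELAY:-}" ]; then
        printf '%s\n' "-f ${FENCED_FAIL_DELAY}"
    fi
}

# Example: one node would set 1, the other 10.
FENCED_FAIL_DELAY=1
echo "fenced $(fenced_opts)"
```

With the variable unset, the helper prints nothing and fenced would be started without -f, which is the default configuration that behaved correctly in the panic test.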

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
