Re: openais question

The problem was the iptables firewall!

I followed the FAQ at http://sources.redhat.com/cluster/faq.html#iptables
Node 2 iptables.conf:
#       rgmanager/clurgmgrd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 41966:41969 -j ACCEPT
#       ccsd
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50006 -j ACCEPT
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 50007 -j ACCEPT
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 50008:50009 -j ACCEPT
#       dlm
-A SERVICOS -p tcp -m tcp -s node1-IPAddr --dport 21064 -j ACCEPT
#       openais
-A SERVICOS -p udp -m udp -s node1-IPAddr --dport 5405 -j ACCEPT
-A SERVICOS -j RETURN

But when these rules are enabled, I see the problems described in my previous email.

Does anyone know what I am missing in my iptables.conf?

Thanks,

Pedro Bandim Faustino
email/sip: pedro.faustino@xxxxxxx



Pedro Bandim Faustino wrote:
Another way to observe the same problem (maybe with a different cause?):

After booting both nodes, I run service cman start on each node, a few seconds apart. The two nodes join the cluster and become quorate. When I then run service cman stop on both nodes (again a few seconds apart), one of the nodes leaves the cluster successfully, but the other prints this:

[root@m07 ~]# service cman stop
Stopping cluster:
  Stopping fencing... done
  Stopping cman... failed
Timed-out waiting for cluster
                                                          [FAILED]

while the messages in the log are
Dec  7 15:27:17 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:17 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
Dec  7 15:27:32 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:32 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
Dec  7 15:27:47 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:47 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
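
For anyone reading the excerpt above, the spacing of the repeated messages can be checked with a short script. This is only a sketch: the function name timeout_intervals is made up for illustration, it looks only at the HH:MM:SS field of each syslog line, and it assumes the excerpt does not cross midnight.

```python
import re

# Log excerpt copied from the message above.
LOG = """\
Dec  7 15:27:17 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:17 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
Dec  7 15:27:32 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:32 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
Dec  7 15:27:47 m07 openais[5436]: [TOTEM] The consensus timeout expired.
Dec  7 15:27:47 m07 openais[5436]: [TOTEM] entering GATHER state from 3.
"""

def timeout_intervals(log_text):
    """Return the seconds between successive 'consensus timeout expired' lines.

    Hypothetical helper: only the time-of-day field is parsed, so an
    excerpt that spans midnight would give a negative interval.
    """
    stamps = []
    for line in log_text.splitlines():
        if "consensus timeout expired" not in line:
            continue
        match = re.search(r"(\d{2}):(\d{2}):(\d{2})", line)
        if match:
            hours, minutes, seconds = map(int, match.groups())
            stamps.append(hours * 3600 + minutes * 60 + seconds)
    # Pair each timestamp with the next one and take the difference.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

print(timeout_intervals(LOG))  # prints [15, 15]
```

In this excerpt the node re-enters the GATHER state every 15 seconds, which suggests it keeps retrying membership on a fixed timer without ever completing.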

Do you know what the problem is?

output of ps fax:
....
5430 ?        Ssl    0:00 /sbin/ccsd
5436 ?        SLl    0:03 aisexec
5450 ?        Ss     0:00 /sbin/groupd
5458 ?        Ss     0:00 /sbin/fenced
5464 ?        Ss     0:00 /sbin/dlm_controld
5470 ?        Ss     0:00 /sbin/gfs_controld
...

cluster.conf
<?xml version="1.0"?>
<cluster name="VoIP_RCTS" config_version="8">

<!-- The quorum disk solves the imbalance caused by this two-node cluster -->
<cman two_node="0" expected_votes="3">
</cman>

<!-- Change logging from /var/log/messages to /log/cluster/cluster.log -->
<rm log_level="6" log_facility="local4">
</rm>

<fence_daemon post_join_delay="10"/>

<clusternodes>
       <clusternode name="m07.<whatever>" votes="1" nodeid="1">
               <fence>
                       <method name="1"><device name="fence_bladecenter-VoIP_RCTS-cluster" blade="7"/></method>
               </fence>
       </clusternode>
       <clusternode name="m08.<whatever>" votes="1" nodeid="2">
               <fence>
                       <method name="1"><device name="fence_bladecenter-VoIP_RCTS-cluster" blade="8"/></method>
               </fence>
       </clusternode>
</clusternodes>

<fencedevices>
<fencedevice name="fence_bladecenter-VoIP_RCTS-cluster" agent="fence_bladecenter" ipaddr="192.168.0.1" login="<login>" password="<password>"/>
</fencedevices>

<!-- Specify here the shared quorum disk -->
<quorumd label="QUORUM-VoIP" votes="1"/>

</cluster>
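
The vote arithmetic behind expected_votes="3" in the configuration above can be sketched as follows. This assumes the usual simple-majority rule (more than half of the expected votes); the function and variable names are made up for illustration and are not part of cman.

```python
def quorum_threshold(expected_votes):
    """Smallest vote count that is a strict majority (assumed rule)."""
    return expected_votes // 2 + 1

# Votes taken from the cluster.conf above.
node_votes = {"m07": 1, "m08": 1}
qdisk_votes = 1

expected = sum(node_votes.values()) + qdisk_votes
needed = quorum_threshold(expected)
print(expected, needed)  # prints 3 2

# One surviving node that still holds the quorum disk has 2 votes.
print(node_votes["m07"] + qdisk_votes >= needed)  # prints True

# One node without the quorum disk has only 1 vote.
print(node_votes["m07"] >= needed)  # prints False
```

Under that assumption, a single node plus the quorum disk stays quorate, while a single node on its own does not.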




Pedro Bandim Faustino
email/sip: pedro.faustino@xxxxxxx

FCCN - Fundação para a Computação Científica Nacional
Av. do Brasil, n.º 101
1700-066 Lisboa
Tel: +351 21 844 0100
Fax: +351 21 847 2167
www.fccn.pt




Pedro Bandim Faustino wrote:
Hi All,

I have a running cluster (v2.01.00) with two Fedora 7 nodes. While testing, I disabled all the NICs on one node and started seeing these messages on the other node:

Dec  7 13:05:55 m07 openais[4233]: [TOTEM] The consensus timeout expired.
Dec  7 13:05:55 m07 openais[4233]: [TOTEM] entering GATHER state from 3.
Dec  7 13:06:10 m07 openais[4233]: [TOTEM] The consensus timeout expired.
Dec  7 13:06:10 m07 openais[4233]: [TOTEM] entering GATHER state from 3.
Dec  7 13:06:25 m07 openais[4233]: [TOTEM] The consensus timeout expired.
Dec  7 13:06:25 m07 openais[4233]: [TOTEM] entering GATHER state from 3.

When I re-enabled the NICs and the network was restored, the same messages kept appearing, now on both nodes.
I've searched but couldn't find an answer or explanation.

Thanks for your help,

------------------------------------------------------------------------

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
