I have a two-node cluster.

/etc/cluster/cluster.conf:
<?xml version="1.0"?>
<cluster alias="saza" config_version="38" name="saza">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="node1.network.com" nodeid="1" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="node2.network.com" nodeid="2" votes="1">
                        <fence/>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices/>
        <rm>
                <failoverdomains>
                        <failoverdomain name="cenall" ordered="1" restricted="1">
                                <failoverdomainnode name="node1.network.com" priority="1"/>
                                <failoverdomainnode name="node2.network.com" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <script file="/etc/init.d/httpd" name="httpd"/>
                        <fs device="/dev/sda3" force_fsck="0" force_unmount="0" fsid="6443" fstype="ext3" mountpoint="/var/www/html" name="httpd-content" options="" self_fence="0"/>
                        <ip address="10.28.99.81" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="cenall" name="webserver">
                        <script ref="httpd"/>
                        <fs ref="httpd-content"/>
                        <ip ref="10.28.99.81"/>
                </service>
        </rm>
</cluster>
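Note that no fence devices are defined: <fencedevices/> is empty and each node's <fence/> block contains no method. For comparison, a populated fencing section would look something like this (the fence_ipmilan agent, device names, address and credentials below are only placeholders):

        <clusternode name="node2.network.com" nodeid="2" votes="1">
                <fence>
                        <method name="1">
                                <device name="ipmi-node2"/>
                        </method>
                </fence>
        </clusternode>
        ...
        <fencedevices>
                <fencedevice agent="fence_ipmilan" ipaddr="192.168.100.102" login="admin" name="ipmi-node2" passwd="secret"/>
        </fencedevices>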
When I restart the network on node2:

        service network stop
        service network start

cman will no longer start on node2 afterwards; it fails at the fencing step:

        Starting fencing... failed
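At this point the membership and group state can be inspected with the standard cman/groupd tools, e.g.:

        cman_tool nodes    # list cluster members and their join/leave status
        cman_tool status   # quorum and cluster generation summary
        group_tool ls      # fence, dlm and gfs group state as seen by groupd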
Message log on node2:
openais[22433]: [SYNC ] Not using a virtual synchrony filter.
Jan 12 02:05:02 clus2 groupd[22442]: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
Jan 12 02:05:02 clus2 openais[22433]: [TOTEM] Creating commit token because I am the rep.
Jan 12 02:05:02 clus2 groupd[22442]: local node must be reset to clear 1 uncontrolled instances of gfs and/or dlm
Jan 12 02:05:02 clus2 openais[22433]: [TOTEM] Saving state aru 0 high seq received 0
Jan 12 02:05:02 clus2 fence_node[22467]: Fence of "node2.network.com" was unsuccessful
Jan 12 02:05:02 clus2 openais[22433]: [TOTEM] entering COMMIT state.
Jan 12 02:05:02 clus2 gfs_controld[22460]: groupd_dispatch error -1 errno 104
Jan 12 02:05:02 clus2 dlm_controld[22454]: groupd is down, exiting
Jan 12 02:05:02 clus2 fenced[22448]: groupd is down, exiting
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] entering RECOVERY state.
Jan 12 02:05:03 clus2 gfs_controld[22460]: groupd connection died
Jan 12 02:05:03 clus2 kernel: dlm: closing connection to node 2
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] position [0] member 192.168.100.2:
Jan 12 02:05:03 clus2 gfs_controld[22460]: cluster is down, exiting
Jan 12 02:05:03 clus2 kernel: dlm: closing connection to node 1
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] previous ring seq 0 rep 192.168.100.2
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] aru 0 high delivered 0 received flag 0
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] Did not need to originate any messages in recovery.
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] Storing new sequence id for ring 4
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] Sending initial ORF token
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] CLM CONFIGURATION CHANGE
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] New Configuration:
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] Members Left:
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] Members Joined:
Jan 12 02:05:03 clus2 openais[22433]: [SYNC ] This node is within the primary component and will provide service.
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] CLM CONFIGURATION CHANGE
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] New Configuration:
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.2)
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] Members Left:
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] Members Joined:
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.2)
Jan 12 02:05:03 clus2 openais[22433]: [SYNC ] This node is within the primary component and will provide service.
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] entering OPERATIONAL state.
Jan 12 02:05:03 clus2 openais[22433]: [CMAN ] quorum regained, resuming activity
Jan 12 02:05:03 clus2 openais[22433]: [CLM ] got nodejoin message 192.168.100.2
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] entering GATHER state from 11.
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] Saving state aru 9 high seq received 9
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] entering COMMIT state.
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] entering RECOVERY state.
Jan 12 02:05:03 clus2 openais[22433]: [TOTEM] position [0] member 192.168.100.1:
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] previous ring seq 132 rep 192.168.100.1
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] aru c high delivered c received flag 0
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] position [1] member 192.168.100.2:
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] previous ring seq 4 rep 192.168.100.2
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] aru 9 high delivered 9 received flag 0
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] Did not need to originate any messages in recovery.
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] Storing new sequence id for ring 88
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] CLM CONFIGURATION CHANGE
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] New Configuration:
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.2)
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] Members Left:
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] Members Joined:
Jan 12 02:05:04 clus2 openais[22433]: [SYNC ] This node is within the primary component and will provide service.
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] CLM CONFIGURATION CHANGE
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] New Configuration:
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.1)
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.2)
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] Members Left:
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] Members Joined:
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] r(0) ip(192.168.100.1)
Jan 12 02:05:04 clus2 openais[22433]: [SYNC ] This node is within the primary component and will provide service.
Jan 12 02:05:04 clus2 openais[22433]: [TOTEM] entering OPERATIONAL state.
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] got nodejoin message 192.168.100.1
Jan 12 02:05:04 clus2 openais[22433]: [CLM ] got nodejoin message 192.168.100.2
Jan 12 02:05:04 clus2 openais[22433]: [CPG ] got joinlist message from node 1
Jan 12 02:05:04 clus2 openais[22433]: [CMAN ] cman killed by node 2 for reason 2
Jan 12 02:05:32 clus2 ccsd[22427]: Unable to connect to cluster infrastructure after 30 seconds.
On node1, cman restarts fine, but the message log there shows:

        fence node2.network.com failed.

How can node2 join the cluster again?
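Based on the "local node must be reset to clear 1 uncontrolled instances of gfs and/or dlm" message above, I assume the recovery is something like the following; is that right?

        # on node2: reboot to clear the uncontrolled dlm/gfs kernel state
        reboot
        # after the reboot, rejoin the cluster and restart resource management
        service cman start
        service rgmanager start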