On 01/02/14 01:35 PM, nik600 wrote:
Dear all
I need some clarification about clustering with RHEL 6.4.
I have a cluster with two nodes in an active/passive configuration; I simply
want to have a virtual IP and migrate it between the two nodes.
I've noticed that if I reboot or manually shut down a node, failover
works correctly, but if I power off one node the cluster doesn't
fail over to the other node.
Another strange situation: if I power off all the nodes and then
switch on only one, the cluster doesn't start on the active node.
I've read the manual and documentation at
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html
and I understand that the problem is related to fencing, but my two
nodes are two virtual machines; I can't control the hardware and can't
issue any custom command on the host side.
I've tried to use fence_xvm, but I'm not sure about it: if my VM
is powered off, how can it reply to fence_xvm messages?
Here are my logs when I power off the VM:
==> /var/log/cluster/fenced.log <==
Feb 01 18:50:22 fenced fencing node mynode02
Feb 01 18:50:53 fenced fence mynode02 dev 0.0 agent fence_xvm result: error from agent
Feb 01 18:50:53 fenced fence mynode02 failed
I've tried to force manual fencing with:
fence_ack_manual mynode02
and in this case the failover works properly.
The point is: since I'm not using any shared filesystem and I'm only
sharing Apache behind a virtual IP, I won't have any split-brain scenario,
so I don't need fencing... or do I?
So, is it possible to have a simple "dummy" fencing agent?
Here is my cluster.conf:
<?xml version="1.0"?>
<cluster config_version="20" name="hacluster">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="0"/>
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="mynode01" nodeid="1" votes="1">
      <fence>
        <method name="mynode01">
          <device domain="mynode01" name="mynode01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="mynode02" nodeid="2" votes="1">
      <fence>
        <method name="mynode02">
          <device domain="mynode02" name="mynode02"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_xvm" name="mynode01"/>
    <fencedevice agent="fence_xvm" name="mynode02"/>
  </fencedevices>
  <rm log_level="7">
    <failoverdomains>
      <failoverdomain name="MYSERVICE" nofailback="0" ordered="0" restricted="0">
        <failoverdomainnode name="mynode01" priority="1"/>
        <failoverdomainnode name="mynode02" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources/>
    <service autostart="1" exclusive="0" name="MYSERVICE" recovery="relocate">
      <ip address="192.168.1.239" monitor_link="on" sleeptime="2"/>
      <apache config_file="conf/httpd.conf" name="apache" server_root="/etc/httpd" shutdown_wait="0"/>
    </service>
  </rm>
</cluster>
Thanks to all in advance.
The fence_virtd/fence_xvm agent works by using multicast to talk to the
VM host. So the "off" confirmation comes from the hypervisor, not the
target.
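As a sanity check (assuming fence_virtd is installed and configured on the
host, and the shared key has been copied to /etc/cluster/fence_xvm.key on
the guests; the node name below is just an example), you can test the
multicast path from a guest with:

fence_xvm -o list
fence_xvm -H mynode02 -o status

If these hang or return an error, the guest can't reach fence_virtd on the
host, which would match the "error from agent" in your fenced log.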
Depending on your setup, you might find better luck with fence_virsh (I
have to use this as there is a known multicast issue with Fedora hosts).
Can you try, as a test if nothing else, whether 'fence_virsh' will work for you?

fence_virsh -a <host ip> -l root -p <host root pw> -n <virsh name for target vm> -o status
If this works, it should be trivial to add to cluster.conf (see the sketch
below), and then you have a working fence method. However, I would recommend
switching back to fence_xvm if you can; the fence_virsh agent is dependent
on libvirtd running, which some consider a risk.
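For reference, a rough sketch of what the cluster.conf entries might look
like (the device name "virsh_fence" and the <host ip>/<host root pw>
placeholders are examples, not values from your setup):

  <clusternode name="mynode01" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="virsh_fence" port="mynode01"/>
      </method>
    </fence>
  </clusternode>
  (repeat for mynode02)
  <fencedevices>
    <fencedevice agent="fence_virsh" name="virsh_fence" ipaddr="<host ip>" login="root" passwd="<host root pw>"/>
  </fencedevices>

Here "port" is the virsh domain name of the guest as seen with 'virsh list'
on the host.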
hth
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?