Since you are using manual fencing, did you run fence_ack_manual after
killing the machine? When the machine comes back up, the ack is implied,
but rgmanager will not be able to perform recovery operations until
fencing is complete. I suggest you use a real fencing agent if you want
this to work seamlessly.

Kevin

On Thu, 2008-08-14 at 10:25 -0400, Chris Edwards wrote:
> I have been trying to simulate a Xen VM failover. I have a 2-machine
> cluster and 2 VMs running. If I issue an “xm destroy ID”, the VM
> will automatically reboot on the other node. But if I reboot one of
> the cluster nodes to simulate a machine failure, the VM never boots
> back up until the other machine comes online. So here are my questions…
>
> 1. How do I get the cluster to boot the VM that has failed when
>    one of the clustered machines is down?
>
> 2. When I do an “xm destroy ID”, the cluster always reboots the VM
>    onto the other cluster machine. Is there any way for me to have it
>    boot back on the machine it is supposed to be running on without
>    having to do a manual migrate? Can it auto-migrate back to its
>    original machine over time?
>
> Here is the output of my clustat during a reboot of one of the
> cluster nodes…
>
> Cluster Status for Xen @ Thu Aug 14 10:11:21 2008
> Member Status: Quorate
>
>  Member Name                  ID   Status
>  ------ ----                  ---- ------
>  xen1.smartechcorp.net        1    Online, Local, rgmanager
>  xen2.smartechcorp.net        2    Offline
>
>  Service Name     Owner (Last)              State
>  ------- ----     ----- ------              -----
>  vm:Linux1        xen2.smartechcorp.net     stopping
>  vm:Windows1      xen1.smartechcorp.net     started
>
> Here is my cluster.conf…
>
> <?xml version="1.0"?>
> <cluster alias="Xen" config_version="29" name="Xen">
>     <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="-1"/>
>     <clusternodes>
>         <clusternode name="xen1.smartechcorp.net" nodeid="1" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="manual" nodename="xen1.smartechcorp.net"/>
>                 </method>
>             </fence>
>         </clusternode>
>         <clusternode name="xen2.smartechcorp.net" nodeid="2" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="manual" nodename="xen2.smartechcorp.net"/>
>                 </method>
>             </fence>
>         </clusternode>
>     </clusternodes>
>     <cman expected_votes="1" two_node="1"/>
>     <fencedevices>
>         <fencedevice agent="fence_manual" name="manual"/>
>     </fencedevices>
>     <rm>
>         <failoverdomains>
>             <failoverdomain name="bias-xen1" nofailback="0" ordered="1" restricted="0">
>                 <failoverdomainnode name="xen1.smartechcorp.net" priority="1"/>
>                 <failoverdomainnode name="xen2.smartechcorp.net" priority="2"/>
>             </failoverdomain>
>             <failoverdomain name="bias-xen2" nofailback="0" ordered="1" restricted="0">
>                 <failoverdomainnode name="xen1.smartechcorp.net" priority="2"/>
>                 <failoverdomainnode name="xen2.smartechcorp.net" priority="1"/>
>             </failoverdomain>
>         </failoverdomains>
>         <resources/>
>         <vm autostart="1" domain="bias-xen1" exclusive="0" migrate="live" name="Windows1" path="/var/lib/xen/images" recovery="relocate"/>
>         <vm autostart="1" domain="bias-xen2" exclusive="0" migrate="live" name="Linux1" path="/var/lib/xen/images" recovery="relocate"/>
>     </rm>
> </cluster>
>
> Thanks for any help, this is driving me crazy!
>
> ---
>
> Chris Edwards
> Smartech Corp.
> Div. of AirNet Group
> http://www.airnetgroup.com
> http://www.smartechcorp.net
> cedwards@xxxxxxxxxxxxxxxx
> P: 423-664-7678 x114
> C: 423-593-6964
> F: 423-664-7680
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
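
For illustration of the "real fencing agent" suggestion above, here is a minimal sketch of what the xen2 fencing entries in cluster.conf might look like with fence_ipmilan in place of fence_manual. It assumes the node has an IPMI-capable management board, which the thread does not confirm. The device name "ipmi-xen2", the address 192.0.2.12, and the login and password are hypothetical placeholders, and the attribute names should be checked against the fence_ipmilan man page for your release.

```xml
<!-- Sketch only: per-node fencing for xen2 using fence_ipmilan.
     The device name, address, and credentials below are placeholders. -->
<clusternode name="xen2.smartechcorp.net" nodeid="2" votes="1">
    <fence>
        <method name="1">
            <!-- Refers to the fencedevice entry with the same name -->
            <device name="ipmi-xen2"/>
        </method>
    </fence>
</clusternode>

<!-- Inside the <fencedevices> section -->
<fencedevices>
    <fencedevice agent="fence_ipmilan" name="ipmi-xen2"
                 ipaddr="192.0.2.12" login="fenceuser" passwd="CHANGE_ME"/>
</fencedevices>
```

xen1 would need an equivalent pair of entries, and config_version has to be incremented whenever cluster.conf changes. If manual fencing is kept instead, the acknowledgement Kevin mentions is typically given on the surviving node with something like `fence_ack_manual -n xen2.smartechcorp.net`, and only after confirming that the failed machine really is powered off; the exact syntax varies between cluster releases.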