Hi,
The service script 'vm.sh' determines the VM service status using the
'xm' command. However, 'xm' relies on xend for proper operation. If xend
is down, the status check fails even though the VM is running, and the
consequences range from needless VM restarts to destroyed VM filesystems.
I would have filed this issue with RH support, but I feel the solution
to this problem requires some qualified thinking in the first place.
What happened:
(Environment: production 4-node Xen / RHEL 5.2 cluster running 30+ PV
guests, Nagios monitoring, VM services configured to "Restart" failover)
a) xenconsoled died (this happens from time to time, monitored by Nagios).
b) An operator ran "service xend restart" to bring xenconsoled back up.
The restart means that xend is down for a short period of time.
c) rgmanager checked 3 VMs during the window in which xend was down. In
vm.sh, the status check
    xm list $OCF_RESKEY_name &> /dev/null
failed because xm could not communicate with xend. As a result,
rgmanager tried to stop and restart these 3 VMs. Because the window
without xend was quite short, xend was up again by the time rgmanager
ran "vm.sh stop" on the 3 VMs, so they were shut down properly and came
back up afterwards.
This was bad enough, but in fact we were lucky, as I learned when
reproducing the issue in our test environment. A notable difference is
that the test cluster is currently set to "Relocate" service recovery.
I also had to shut down xend for the test, so it was down significantly
longer than on the production cluster.
Background information on xend: xend is not required for Xen VMs to run,
it is only required to control VMs. Restarting xend while VMs are
running is a safe operation.
As a result of the longer xend downtime, "vm.sh stop" could not shut
down the VM, as the stop operation again uses 'xm' to communicate with xend.
Afterwards rgmanager started the VM on another cluster node, where it
came up perfectly well.
But the VM was never shut down on the cluster node where xend was not
running. As a result, the VM (which is installed on shared storage) was
running on two different nodes at once, and its ext3 filesystems were
mounted rw by both VM instances.
No production server's filesystems would have survived this for more
than a couple of seconds. So there is a risk of severe damage here,
especially as "relocate" is the default failover configuration.
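For illustration, the stop-side failure could be guarded against along
the following lines. This is only a sketch, not the actual vm.sh code:
the function names xend_reachable and guarded_stop are made up, and it
assumes that rgmanager does not start a service elsewhere once its stop
operation has reported an error, which would need to be verified.

```shell
# Hypothetical helper (not part of vm.sh): succeeds only if xend
# answers a trivial request.
xend_reachable() {
    xm info > /dev/null 2>&1
}

# Hypothetical guarded stop: refuse to claim the VM is down when xend
# cannot be asked about it.
guarded_stop() {
    name="$1"
    if ! xend_reachable; then
        echo "xend unreachable, cannot confirm that $name is down" >&2
        return 1    # generic error: the VM state is unknown
    fi
    # Asks the guest to shut down; this sketch does not wait for
    # the shutdown to complete.
    xm shutdown "$name"
}
```

The point is that "cannot tell" is reported as an error rather than as
"stopped", so an unreachable xend is never mistaken for a VM that is down.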
As a workaround I propose to change vm.sh:
 status()
 {
+       xm info &> /dev/null || return 0
        xm list $OCF_RESKEY_name &> /dev/null
        if [ $? -eq 0 ]; then
                return 0
        fi
        xm list migrating-$OCF_RESKEY_name &> /dev/null
        return $?
 }
However, this is not good enough: xend may vanish between 'xm info' and
'xm list', leading to the scenario described above.
Therefore xend should be a cluster service, and the VM services would
have to depend on it. If a VM check fails, rgmanager would additionally
have to check xend, and act on the VM only if xend has not failed and
the VM fails a second test (xend may have just come up again, so the VM
needs to be retested).
If a VM has failed and it turns out that xend has failed as well,
rgmanager should try to reactivate xend.
If xend cannot be started, the cluster node has to be fenced. As xend is
not required for VMs to run, the VMs may be perfectly fine and must not
be restarted on another node unless they are guaranteed to be down.
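The decision logic proposed above can be sketched as a few shell
functions. This is only an illustration of the idea, not rgmanager or
vm.sh code: the names xend_ok, vm_ok, decide and recover_xend and the
action strings they print are all made up, and the race between the
individual 'xm' calls remains.

```shell
# Hypothetical helpers; none of these names exist in vm.sh or rgmanager.
xend_ok() {
    xm info > /dev/null 2>&1
}

vm_ok() {
    xm list "$1" > /dev/null 2>&1 ||
        xm list "migrating-$1" > /dev/null 2>&1
}

# Prints the action to take for VM $1: ok, recover-vm or restart-xend.
decide() {
    name="$1"
    if vm_ok "$name"; then
        echo ok
        return 0
    fi
    # The first test failed. Is an unreachable xend the reason?
    if ! xend_ok; then
        echo restart-xend
        return 0
    fi
    # xend answers (it may have just come back), so test a second time.
    if vm_ok "$name"; then
        echo ok
    else
        echo recover-vm
    fi
}

# Prints xend-restarted, or fence-node if xend cannot be brought back.
recover_xend() {
    # Exit status ignored on purpose; xend_ok decides the outcome.
    service xend restart > /dev/null 2>&1 || true
    if xend_ok; then
        echo xend-restarted
    else
        echo fence-node
    fi
}
```

For example, "decide myvm" prints "restart-xend" while xend is down, and
the VM is left alone in that case.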
Any comment is welcome.
best regards, Gunther
--
Gunther Schlegel
Manager IT Infrastructure
.............................................................
Riege Software International GmbH Fon: +49 (2159) 9148 0
Mollsfeld 10 Fax: +49 (2159) 9148 11
40670 Meerbusch Web: www.riege.com
Germany E-Mail: schlegel@xxxxxxxxx
--- ---
Handelsregister: Managing Directors:
Amtsgericht Neuss HRB-NR 4207 Christian Riege
USt-ID-Nr.: DE120585842 Gabriele Riege
Johannes Riege
.............................................................
YOU CARE FOR FREIGHT, WE CARE FOR YOU
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster