I am building a cluster of Xen guests whose root file systems reside in
files on a GFS filesystem, which in turn sits on an iSCSI device; each
cluster node mounts that GFS filesystem. For performance reasons, both
the iSCSI target and the physical nodes (which are themselves part of a
cluster) use two gigabit Ethernet interfaces with bonding and LACP.
On the physical machines I had to insert a "sleep 30" in the
/etc/init.d/iscsi script before the iSCSI login, to wait for the bond
interface to come up; otherwise the iSCSI devices are not seen and the
GFS mount fails.
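For what it's worth, instead of a fixed sleep, the init script could poll the bond's link state and continue as soon as it is up. A minimal sketch; the interface name (bond0), the timeout, and the helper name are assumptions to adapt to your setup:

```shell
#!/bin/sh
# Hypothetical helper for /etc/init.d/iscsi: wait until the bond interface
# reports link up, instead of sleeping a fixed 30 seconds.
wait_for_iface() {
    iface=$1
    timeout=${2:-30}
    i=0
    while [ "$i" -lt "$timeout" ]; do
        # operstate reads "up" once the link (and LACP negotiation) is ready
        if [ "$(cat /sys/class/net/"$iface"/operstate 2>/dev/null)" = "up" ]; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# In /etc/init.d/iscsi, just before the iscsi login:
#   wait_for_iface bond0 30 || echo "warning: bond0 still down after 30s"
```

This is at worst equivalent to the fixed sleep (it gives up after the same 30 seconds) and at best much faster when the bond negotiates quickly.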
The cluster of Xen guests itself works fine: I can migrate each guest to
a different physical node without problems on the guest. But when I
reboot or fence a guest, the guest cluster breaks: quorum is dissolved
and I have to fence ALL the nodes and reboot them to get the cluster
running again. Could this be the Xen bridge going down and coming back
up for longer than the heartbeat timeout?
Is this entry in the FAQ still valid (and therefore the solution to the
problems I found)?
When I reboot a xen dom, I get cluster errors and it gets fenced. What's
going on and how do I fix it?
As I understand it, the problem is due to the fact that Xen nodes tear
down and rebuild the Ethernet NIC after the cluster suite has started.
We're working on a more permanent solution. In the meantime, here is a
workaround:
1. Edit the file /etc/xen/xend-config.sxp. Locate the line that
reads:
(network-script network-bridge)
Change that line to read:
(network-script /bin/true)
2. Create and/or edit file /etc/sysconfig/network-scripts/ifcfg-eth0
to look something like:
DEVICE=eth0
ONBOOT=yes
BRIDGE=xenbr0
HWADDR=XX:XX:XX:XX:XX:XX
Where XX:XX:XX:XX:XX:XX is the MAC address of your network card.
3. Create and/or edit file
/etc/sysconfig/network-scripts/ifcfg-xenbr0 to look something like:
DEVICE=xenbr0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.116
NETMASK=255.255.255.0
GATEWAY=10.0.0.254
TYPE=Bridge
DELAY=0
Substitute your appropriate IP address, netmask and gateway
information.
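If the bridge teardown really does outlast the heartbeat timeout, one way to test that theory (a diagnostic, not a fix) is to raise the totem token timeout in /etc/cluster/cluster.conf. The cluster name and the value below are purely illustrative:

```xml
<!-- /etc/cluster/cluster.conf fragment (illustrative values): allow up
     to 30 s without a token before a node is declared dead -->
<cluster name="guestclust" config_version="2">
  <totem token="30000"/>
  <!-- ... rest of the cluster configuration ... -->
</cluster>
```

If the guest cluster survives a reboot with a longer token timeout, that points at the bridge-rebuild window as the cause.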
Thanks, Paolo
Paolo Marini
Prisma Engineering srl, via Petrocchi 4, 20152 Milano, Italy
tel +39 02 26113507 | fax +39 02 26113597 | cell +39 335 6525835
paolom@xxxxxxxxxxxxx | http://www.prisma-eng.com
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster