Cluster of XEN guests unstable when rebooting a node

I am building a cluster of Xen guests whose root file systems are files on a GFS file system backed by an iSCSI device. Each cluster node mounts a GFS file system residing on an iSCSI device. For performance reasons, both the iSCSI device and the physical nodes (which are themselves part of a cluster) use two bonded gigabit Ethernet links with LACP.

On the physical machines, I had to insert a sleep 30 in the /etc/init.d/iscsi script before the iSCSI login, to wait for the bond interface to come up; otherwise the iSCSI devices are not seen and the GFS mount is not possible.
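
A fixed sleep could probably be replaced by a polling loop that stops as soon as the bond reports link. This is only a minimal sketch: the interface name bond0, the 30-second limit and the function name wait_for_link are assumptions, and it relies on the kernel exposing /sys/class/net/<iface>/operstate. The link being "up" does not guarantee that LACP negotiation has finished, so some extra margin may still be needed.

```shell
#!/bin/sh
# wait_for_link [IFACE] [TIMEOUT] [SYSFS_ROOT]
# Poll the interface's operstate once per second until it reads "up"
# or TIMEOUT seconds have elapsed. Returns 0 on link up, 1 on timeout.
# bond0, 30 and /sys/class/net are assumed defaults, not taken from
# the stock iscsi init script.
wait_for_link() {
    iface=${1:-bond0}
    timeout=${2:-30}
    root=${3:-/sys/class/net}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # A missing file (interface not created yet) counts as "not up".
        state=$(cat "$root/$iface/operstate" 2>/dev/null) || state=""
        if [ "$state" = "up" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In the init script, something like `wait_for_link bond0 30 || echo "bond0 still down, iSCSI login may fail"` would then take the place of the sleep 30, just before the iSCSI login.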

The cluster of Xen guests itself works fine: I can migrate each guest to a different physical node without any problem on the guest. However, when I reboot or fence a guest, the guest cluster breaks: quorum is dissolved, and I have to fence ALL the nodes and reboot them before the cluster restarts. Could this be caused by the Xen bridge going down and coming back up over a period longer than the heartbeat timeout?

Is the following FAQ entry still valid, and is it therefore the solution to the problems I found?

When I reboot a xen dom, I get cluster errors and it gets fenced. What's going on and how do I fix it?

As I understand it, the problem is due to the fact that xen nodes tear down and rebuild the ethernet nic after cluster suite has started. We're working on a more permanent solution. In the meantime, here is a workaround:

 1. Edit the file /etc/xen/xend-config.sxp. Locate the line that
    reads:

    (network-script network-bridge)

    Change that line to read:

    (network-script /bin/true)

 2. Create and/or edit file /etc/sysconfig/network-scripts/ifcfg-eth0
    to look something like:

    DEVICE=eth0
    ONBOOT=yes
    BRIDGE=xenbr0
    HWADDR=XX:XX:XX:XX:XX:XX

    Where XX:XX:XX:XX:XX:XX is the MAC address of your network card.

 3. Create and/or edit file
    /etc/sysconfig/network-scripts/ifcfg-xenbr0 to look something like:

    DEVICE=xenbr0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=10.0.0.116
    NETMASK=255.255.255.0
    GATEWAY=10.0.0.254
    TYPE=Bridge
    DELAY=0

    Substitute your appropriate IP address, netmask and gateway
    information.
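
Step 1 above could also be scripted. The following is a minimal sketch, not part of the FAQ: the function name disable_network_bridge is hypothetical, and it assumes the line appears exactly as "(network-script network-bridge)" at the start of a line. A variant with extra arguments or quoting would not be matched and would have to be edited by hand.

```shell
#!/bin/sh
# disable_network_bridge [FILE]
# Replace the "(network-script network-bridge)" line with
# "(network-script /bin/true)" so that xend no longer rebuilds the
# bridge itself. The original file is kept as FILE.bak.
disable_network_bridge() {
    conf=${1:-/etc/xen/xend-config.sxp}
    # In a basic regular expression the parentheses are literal characters.
    sed -i.bak 's|^(network-script network-bridge)|(network-script /bin/true)|' "$conf"
}
```

Running `disable_network_bridge` with no argument edits /etc/xen/xend-config.sxp in place; a different path can be passed to try it on a copy first.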


Thanks, Paolo

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
