Hello,
a short, positive update on this topic.

I figured out that TDBs are not a good idea to share and should not reside on GFS. When I make /var/lib/xenstored host-dependent and mount a local filesystem (ext3) underneath it, everything works:
1. Now I can *LIVE* migrate a sharedroot clusternode DomU from one Dom0 to the other.
2. Now I can fence one DomU from another DomU running on a different Dom0.

That rocks!!

BTW concerning TDBs:
I had to do the same when I configured a Samba cluster (some time ago), which also uses TDBs. There the TDBs were only cache files, so I didn't care. But now the TDBs seem to be somehow important.
Does anybody know why TDBs should not be hosted on GFS? It works, but it is very, very slow.

Marc.

On Friday 19 October 2007 08:42:02 Marc Grimme wrote:
> On Thursday 18 October 2007 22:00:33 Lon Hohberger wrote:
> > On Wed, 2007-10-17 at 11:41 +0200, Marc Grimme wrote:
> > > Hello,
> > > we are currently discussing XEN with clustering support.
> > > There came some questions we are not sure what the answer is. Perhaps
> > > you can help ;-) .
> > >
> > > Background is: We are discussing a group of XEN Dom0 Hosts sharing all
> > > devices and files via GFS. They themselves again host a couple of
> > > virtually redhat-clustered DomU Hosts with or without gfs.
> > >
> > > 1. Live Migration of cluster DomU nodes:
> > > When I live migrate a virtual DomU clusternode to another DOM0 XEN Host
> > > the migration works ;-) , but the virtual clusternode is thrown out of
> > > the cluster. Is this a "works as designed"? I think the problem are the
> > > heartbeats not coming in proper time.
> > > Does that lead to the conclusion that one cannot live migrate cluster
> > > nodes?
> >
> > Depends. If you're using rgmanager to do migration, the migration is
> > actually not live. In order to do live migration,
> > change /usr/share/cluster/vm.sh...
> >
> > - where it says 'xm migrate ...'
> > - change it to 'xm migrate -l ...'
>
> Ok got it.
> Still did you try to live migrate a cluster node?
>
> > That should enable live migration.
> >
> > > 2. Fencing:
> > > How about fencing of the virtual Dom-U Clusternodes. You are never sure
> > > on which Dom-0 Node runs our Dom-U Clusternode. Is the fencing via
> > > fence_xvm[d] supported on such an environment? That means how does a
> > > virtual DomU clusternode X running on Dom0 Xen Host x know that if
> > > virtual DomU clusternode Y running on Dom0 Xen Host y is running there
> > > when it is getting the fence request to fence Host y where it is
> > > running?
> >
> > Yes. Fence_xvmd is designed (specifically) to handle the case where the
> > dom0 hosting a particular domU is not known. Note that this only works
> > on RHEL5 with openais and such; fence_xvmd uses AIS checkpoints to store
> > virtual machine locations.
> >
> > Notes:
> > * the parent dom0 cluster still needs fencing, too :)
>
> Yes. Thats in place. Check.
>
> > * do not mix domU and dom0 in the same cluster,
>
> I didn't. Check.
>
> > * all domUs within a dom0 cluster must have different domain names,
>
> Ups. hostname -d on dom0 and hostname -d on domu need to be different?
> What if they are empty?
> Or do you mean some other domainname?
> Dom0:
> [root@axqa01_2 ~]# hostname -d
> [root@axqa01_2 ~]#
> DomU:
> [root@axqa03_1 ~]# hostname -d
> cc.atix
>
> > * do *not* reuse /etc/xen/fence_xvm.key between multiple dom0 clusters
>
> I just did not use it.
> Dom0:
> [root@axqa01_2 ~]# ps ax | grep [f]ence_xvmd
> 1932 pts/1 S+ 0:00 fence_xvmd -ddddd -f -c none -C none
> So on axqa01_2 runs axqa03_2 and on axqa01_1 runs axqa03_1
> Then when I do a
> ./fence_xvm -ddddd -C none -c none -H axqa03_2 on axqa03_1 I get the
> following:
> Waiting for response
> Received 264 bytes
> Adding IP 127.0.0.1 to list (family 2)
> Adding IP 10.1.2.1 to list (family 2)
> Adding IP 192.168.10.40 to list (family 2)
> Adding IP 192.168.122.1 to list (family 2)
> Closing Netlink connection
> ipv4_listen: Setting up ipv4 listen socket
> ipv4_listen: Success; fd = 3
> Setting up ipv4 multicast send (225.0.0.12:1229)
> Joining IP Multicast group (pass 1)
> Joining IP Multicast group (pass 2)
> Setting TTL to 2 for fd4
> ipv4_send_sk: success, fd = 4
> sign_request: no-op (HASH_NONE)
> Sending to 225.0.0.12 via 127.0.0.1
> Setting up ipv4 multicast send (225.0.0.12:1229)
> Joining IP Multicast group (pass 1)
> Joining IP Multicast group (pass 2)
> Setting TTL to 2 for fd4
> ipv4_send_sk: success, fd = 4
> sign_request: no-op (HASH_NONE)
> Sending to 225.0.0.12 via 10.1.2.1
> Setting up ipv4 multicast send (225.0.0.12:1229)
> Joining IP Multicast group (pass 1)
> Joining IP Multicast group (pass 2)
> Setting TTL to 2 for fd4
> ipv4_send_sk: success, fd = 4
> sign_request: no-op (HASH_NONE)
> Sending to 225.0.0.12 via 192.168.10.40
> Setting up ipv4 multicast send (225.0.0.12:1229)
> Joining IP Multicast group (pass 1)
> Joining IP Multicast group (pass 2)
> Setting TTL to 2 for fd4
> ipv4_send_sk: success, fd = 4
> sign_request: no-op (HASH_NONE)
> Sending to 225.0.0.12 via 192.168.122.1
> Waiting for connection from XVM host daemon.
> Issuing TCP challenge
> tcp_challenge: no-op (AUTH_NONE)
> Responding to TCP challenge
> tcp_response: no-op (AUTH_NONE)
> TCP Exchange + Authentication done...
> Waiting for return value from XVM host
> Remote: Operation failed
>
> on axqa01_2:
> ------   ----                                 ----- -----
> axqa03_2 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00002 00001
> Storing axqa03_2
> libvir: Xen Daemon error : GET operation failed:
> Domain   UUID                                 Owner State
> ------   ----                                 ----- -----
> axqa03_2 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00002 00001
> Storing axqa03_2
> Request to fence: axqa03_2
> axqa03_2 is running locally
> Plain TCP request
> libvir: Xen Daemon error : GET operation failed:
> libvir: error : invalid argument in __virGetDomain
> libvir: Xen Store error : out of memory
> tcp_response: no-op (AUTH_NONE)
> tcp_challenge: no-op (AUTH_NONE)
> Rebooting domain axqa03_2...
> [[ XML Domain Info ]]
> <domain type='xen'>
>   <name>axqa03_2</name>
>   <uuid>1732aae45a110676113df9e7da458b61</uuid>
>   <os>
>     <type>linux</type>
>     <kernel>/var/lib/xen/boot/vmlinuz-2.6.18-52.el5xen</kernel>
>     <initrd>/var/lib/xen/boot/initrd_sr-2.6.18-52.el5xen.img</initrd>
>   </os>
>   <currentMemory>366592</currentMemory>
>   <memory>366592</memory>
>   <vcpu>2</vcpu>
>   <on_poweroff>destroy</on_poweroff>
>   <on_reboot>restart</on_reboot>
>   <on_crash>restart</on_crash>
>   <devices>
>     <disk type='block' device='disk'>
>       <driver name='phy'/>
>       <source dev='sds'/>
>       <target dev='sds'/>
>     </disk>
>     <disk type='file' device='disk'>
>       <driver name='file'/>
>       <source file='/var/lib/xen/images/axqa03_2.localdisk.dd'/>
>       <target dev='sda'/>
>     </disk>
>     <interface type='bridge'>
>       <mac address='aa:00:00:00:00:12'/>
>       <source bridge='xenbr0'/>
>     </interface>
>     <interface type='bridge'>
>       <mac address='00:16:3e:43:90:d2'/>
>       <source bridge='xenbr1'/>
>     </interface>
>     <console/>
>   </devices>
> </domain>
>
> [[ XML END ]]
> Virtual machine is Linux
> Unlinkiking os block
> [[ XML Domain Info (modified) ]]
> <?xml version="1.0"?>
> <domain type="xen">
>   <name>axqa03_2</name>
>   <uuid>1732aae45a110676113df9e7da458b61</uuid>
>   <currentMemory>366592</currentMemory>
>   <memory>366592</memory>
>   <vcpu>2</vcpu>
>   <on_poweroff>destroy</on_poweroff>
>   <on_reboot>restart</on_reboot>
>   <on_crash>restart</on_crash>
>   <devices>
>     <disk type="block" device="disk">
>       <driver name="phy"/>
>       <source dev="sds"/>
>       <target dev="sds"/>
>     </disk>
>     <disk type="file" device="disk">
>       <driver name="file"/>
>       <source file="/var/lib/xen/images/axqa03_2.localdisk.dd"/>
>       <target dev="sda"/>
>     </disk>
>     <interface type="bridge">
>       <mac address="aa:00:00:00:00:12"/>
>       <source bridge="xenbr0"/>
>     </interface>
>     <interface type="bridge">
>       <mac address="00:16:3e:43:90:d2"/>
>       <source bridge="xenbr1"/>
>     </interface>
>     <console/>
>   </devices>
> </domain>
>
> [[ XML END ]]
> [REBOOT] Calling virDomainDestroy
> virDomainDestroy() failed: -1
> Sending response to caller...
>
> libvir: Xen Daemon error : GET operation failed:
> Domain   UUID                                 Owner State
> ------   ----                                 ----- -----
> axqa03_2 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00002 00001
> Storing axqa03_2
>
> on axqa01_1:
>
> Domain   UUID                                 Owner State
> ------   ----                                 ----- -----
> axqa03_1 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00001 00001
> Storing axqa03_1
> Domain   UUID                                 Owner State
> ------   ----                                 ----- -----
> axqa03_1 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00001 00001
> Storing axqa03_1
> Request to fence: axqa03_2
> Evaluating Domain: axqa03_2 Last Owner: 2 State 2
> Domain   UUID                                 Owner State
> ------   ----                                 ----- -----
> axqa03_1 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
> Domain-0 00000000-0000-0000-0000-000000000000 00001 00001
> Storing axqa03_1
> Domain   UUID                                 Owner State
>
> Any ideas?
>
> Marc.
>
> > -- Lon
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster@xxxxxxxxxx
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Gruss / Regards,
>
> Marc Grimme
> http://www.atix.de/ http://www.open-sharedroot.org/