Hi,

I am trying to configure Xen guests as virtual services under Cluster Suite. My configuration is simple: node one, "d1", runs a Xen guest as virtual service "vm_service1", and node two, "d2", runs virtual service "vm_service2". The /etc/cluster/cluster.conf file is
below:

<?xml version="1.0"?>
<cluster alias="VM_Data_Cluster" config_version="112" name="VM_Data_Cluster">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="300"/>
    <clusternodes>
        <clusternode name="d1" nodeid="1" votes="1">
            <multicast addr="225.0.0.1" interface="eth0"/>
            <fence>
                <method name="1">
                    <device name="apc_power_switch" port="1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="d2" nodeid="2" votes="1">
            <multicast addr="225.0.0.1" interface="eth0"/>
            <fence>
                <method name="1">
                    <device name="apc_power_switch" port="2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1">
        <multicast addr="225.0.0.1"/>
    </cman>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="X.X.X.X" login="apc" name="apc_power_switch" passwd="apc"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="VM_d1_failover" ordered="0" restricted="0">
                <failoverdomainnode name="d1" priority="1"/>
            </failoverdomain>
            <failoverdomain name="VM_d2_failover" ordered="0" restricted="0">
                <failoverdomainnode name="d2" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources/>
        <vm autostart="1" domain="VM_d1_failover" exclusive="0" name="vm_service1" path="/virts/service1" recovery="relocate"/>
        <vm autostart="1" domain="VM_d2_failover" exclusive="0" name="vm_service2" path="/virts/service2" recovery="relocate"/>
    </rm>
    <totem consensus="4800" join="60" token="10000" token_retransmits_before_loss_const="20"/>
    <fence_xvmd family="ipv4"/>
</cluster>

On guests "vm_service1" and "vm_service2" I have
configured the second cluster:

<cluster alias="SV_Data_Cluster" config_version="29" name="SV_Data_Cluster">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="d11" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device domain="d11" name="virtual_fence"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="d12" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device domain="d12" name="virtual_fence"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_xvm" name="virtual_fence"/>
    </fencedevices>
    <rm>
        ...
    </rm>
</cluster>

The problem is that the
fence_xvmd/fence_xvm mechanism doesn't work, probably due to a misconfiguration of multicast. The physical nodes "d1" and "d2" and the Xen guests "vm_service1" and "vm_service2" each have two Ethernet interfaces: a private one, 10.0.200.x (eth0), and a public one (eth1). On the physical nodes, the "fence_xvmd" daemon by default listens on the eth1 interface:

[root@d2 ~]# netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      ALL-SYSTEMS.MCAST.NET
eth0            1      225.0.0.1
eth0            1      ALL-SYSTEMS.MCAST.NET
eth1            1      225.0.0.12
eth1            1      ALL-SYSTEMS.MCAST.NET
virbr0          1      ALL-SYSTEMS.MCAST.NET
lo              1      ff02::1
....

Next, from Xen guest "vm_service1" I run a test to fence guest "vm_service2" and get:

[root@d11 cluster]# /sbin/fence_xvm -H d12 -ddddd
Debugging threshold is now 5
-- args @ 0xbf8aea70 --
args->addr = 225.0.0.12
args->domain = d12
args->key_file = /etc/cluster/fence_xvm.key
args->op = 2
args->hash = 2
args->auth = 2
args->port = 1229
args->family = 2
args->timeout = 30
args->retr_time = 20
args->flags = 0
args->debug = 5
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbf8ada1c (4096 max size)
Actual key length = 4096 bytes
Opening /dev/urandom
Sending to 225.0.0.12 via 127.0.0.1
Opening /dev/urandom
Sending to 225.0.0.12 via X.X.X.X
Opening /dev/urandom
Sending to 225.0.0.12 via 10.0.200.124
Waiting for connection from XVM host daemon.
....
Waiting for connection from XVM host daemon.
Timed out waiting for response

On the node "d2" where "vm_service2" is running I get:

[root@d2 ~]# /sbin/fence_xvmd -fddd
Debugging threshold is now 3
-- args @ 0xbfc54e3c --
args->addr = 225.0.0.12
args->domain = (null)
args->key_file = /etc/cluster/fence_xvm.key
args->op = 2
args->hash = 2
args->auth = 2
args->port = 1229
args->family = 2
args->timeout = 30
args->retr_time = 20
args->flags = 1
args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbfc53e3c (4096 max size)
Actual key length = 4096 bytes
Opened ckpt vm_states
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
vm_service2              2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001
Storing vm_service2
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
vm_service2              2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001
Storing vm_service2
Request to fence: d12
Evaluating Domain: d12   Last Owner/State Unknown
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
vm_service2              2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001
Storing vm_service2
Request to fence: d12
Evaluating Domain: d12   Last Owner/State Unknown

So it looks like the fence_xvmd and
fence_xvm cannot communicate with each other. But "fence_xvm" on "vm_service1" sends multicast packets through all interfaces, and node "d2" does receive them. tcpdump on node "d2" shows the packets arriving:

[root@d2 ~]# tcpdump -i peth0 -n host 225.0.0.12
listening on peth0, link-type EN10MB
(Ethernet), capture size 96 bytes
17:50:47.972477 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176
17:50:49.960841 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176
17:50:51.977425 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

[root@d2 ~]# tcpdump -i peth1 -n host 225.0.0.12
listening on peth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:51:26.168132 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176
17:51:28.184802 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176
17:51:30.196875 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

But I can't see node "d2" sending anything back to Xen guest "vm_service1", so "fence_xvm" times out. What am I doing wrong?

Cheers,
Agnieszka Kukałowicz
NASK, Polska.pl
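P.S. My current hypothesis is that the reply path is broken rather than the request path: the request is multicast to 225.0.0.12, but the acknowledgement is a TCP connection from fence_xvmd on the host back to the guest on port 1229 (that is the "Waiting for connection from XVM host daemon" phase; 1229 is the args->port value in the debug output above). A sketch of what I am considering trying, not a tested fix, and the fence_xvmd -I flag is an assumption on my part (check fence_xvmd -h on your version):

```shell
# On the guests: pin all multicast (224.0.0.0/4) to the private interface,
# so fence_xvm's request leaves via the 10.0.200.x network only.
route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0

# On the guests: accept the TCP connect-back from the host on port 1229
# (the args->port value shown in the debug output above).
iptables -I INPUT -p tcp --dport 1229 -j ACCEPT

# On the physical nodes: if the fence_xvmd build supports choosing the
# listen interface (this -I flag is an assumption -- check fence_xvmd -h),
# bind it to eth0 instead of letting it default to eth1.
fence_xvmd -fdd -I eth0
```

Has anyone confirmed which interface fence_xvmd sends its reply from when the request arrives on peth0?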
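P.P.S. To rule out an authentication problem I also want to double-check that every node and guest has a byte-identical /etc/cluster/fence_xvm.key (the debug output above shows a 4096-byte key on both sides). A quick check, with /tmp paths used here purely for illustration:

```shell
# Generate a 4096-byte random key -- the same size fence_xvm reports above.
# The real file lives at /etc/cluster/fence_xvm.key and must be identical
# on every physical node and every guest.
dd if=/dev/urandom of=/tmp/fence_xvm.key bs=4096 count=1 2>/dev/null

# Simulate copying it to a guest, then verify both copies match.
cp /tmp/fence_xvm.key /tmp/fence_xvm.key.guest
md5sum /tmp/fence_xvm.key /tmp/fence_xvm.key.guest
```

If the checksums differ anywhere, the host daemon would never answer the request, which (as far as I understand) would also look exactly like this timeout on the fence_xvm side.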
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster