Regarding multicast: a symptom I had when multicasting was disabled was that "ccs_tool update cluster.conf" didn't work, i.e. it didn't push out the updated cluster.conf (which makes some sense!). So you can check by updating the version of cluster.conf on one member and running this (from in /etc/cluster).

Have you got all your firewall rules done or turned off? Your error message seems to be saying there is a problem with ricci (that 11111 port).

On RH 5.1 on a two-node cluster I had too many issues with luci and ricci misbehaving or giving wrong information. The command-line tools worked much better for me.

"cman_tool status" :-)

Bevan Broun
Solutions Architect
Ardec International
http://www.ardec.com.au
http://www.lisasoft.com
http://www.terrapages.com
Sydney
-----------------------
Suite 112, The Lower Deck
19-21 Jones Bay Wharf
Pirrama Road, Pyrmont 2009
Ph: +61 2 8570 5000
Fax: +61 2 8570 5099

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Harri.Paivaniemi@xxxxxxxxxxxxxxx
Sent: Tuesday, 25 November 2008 4:43 PM
To: neuroticimbecile@xxxxxxxxx; linux-cluster@xxxxxxxxxx
Subject: RE: Unable to retrieve batch 1493885544 status from x.x.x.86:11111: module scheduled for execution

Hi,

Are you sure multicast is working in your switch? Openais uses it...

I have had several really odd misbehaviours because of multicast not working...

-hjp

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx on behalf of eric rosel
Sent: Mon 11/24/2008 19:02
To: linux-cluster@xxxxxxxxxx
Subject: Unable to retrieve batch 1493885544 status from x.x.x.86:11111: module scheduled for execution

Hi All,

I'm trying to set up a 2-node cluster using luci. As of now, I have only configured it with a single service, with a single resource: an IP address, so I could test if the IP address fails over to the other node. So far, it doesn't.

In /var/log/messages of the first node, it says:

"Unable to retrieve batch 1493885544 status from x.x.x.86:11111: module scheduled for execution"

It seems that each node is unaware of the other; "cman_tool nodes" says, respectively:

===<snip>===
Node  Sts   Inc   Joined               Name
   1   M     48   2008-11-24 23:46:04  x.x.x.85
   2   X      0                        x.x.x.86
===<snip>===
Node  Sts   Inc   Joined               Name
   1   X      0                        x.x.x.85
   2   M     72   2008-11-24 23:32:43  x.x.x.86
===<snip>===

My /etc/cluster/cluster.conf contains:

===<snip>===
<?xml version="1.0"?>
<cluster alias="binary.cluster" config_version="18" name="binary.cluster">
        <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="202.81.160.85" nodeid="1" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="202.81.160.86" nodeid="2" votes="1">
                        <fence/>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices/>
        <rm>
                <resources>
                        <ip address="202.81.160.87" monitor_link="0"/>
                </resources>
                <service autostart="1" exclusive="1" name="binary.service" recovery="relocate">
                        <ip ref="202.81.160.87"/>
                </service>
                <failoverdomains/>
        </rm>
</cluster>
===<snip>===

I've already tried some things mentioned in the list archives:

1. "ccs_test connect" returns "Connect successful." on both nodes.

2. Although I'm using IP addresses in cluster.conf, I've added hostname definitions in /etc/hosts on both nodes:

===<snip>===
x.x.x.85 node1.domain.com node1
x.x.x.86 node2.domain.com node2
===<snip>===

3. When I manually copy /etc/cluster/cluster.conf to both nodes and do a "cman_tool version -r <version_number>", luci shows both nodes' "Status" as "Cluster Member". But when I try to make any changes using luci, the second node becomes "Not a Cluster Member", and doing a "Have node join cluster" doesn't make it a member.

I'm running on CentOS 5.2 with:

luci-0.12.0-7.el5.centos.3
ricci-0.12.0-7.el5.centos.3
cman-2.0.84-2.el5_2.1
rgmanager-2.0.38-2.el5_2.1

Any pointers on how to make this work?

TIA,
-eric

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
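
For anyone hitting the same symptom: the two "cman_tool nodes" listings quoted in the original message can be compared mechanically to confirm that each node only sees itself. Below is a rough Python sketch of that comparison. The function names (parse_cman_nodes, missing_members) are made up for illustration and are not part of any cluster tool, and the parsing assumes the column layout shown in this thread (Node, Sts, Inc, Joined, Name, with "M" in the Sts column meaning the node is a member); other cman versions may print the table differently.

```python
# Sketch: compare the membership views reported by "cman_tool nodes" on each node.
# Assumption: the column layout (Node, Sts, Inc, Joined, Name) matches the output
# quoted in this thread, and "M" in the Sts column marks a current member.


def parse_cman_nodes(text):
    """Return {node_name: status} parsed from the text of "cman_tool nodes"."""
    view = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row; data rows start with a numeric node id.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        # Status is the second column. The name is always the last column,
        # because the "Joined" timestamp is absent for nodes that never joined.
        view[fields[-1]] = fields[1]
    return view


def missing_members(view):
    """Names of nodes that this view does not report as members ("M")."""
    return sorted(name for name, status in view.items() if status != "M")


# The two listings from the original message, one per node.
NODE1_OUTPUT = """\
Node  Sts   Inc   Joined               Name
   1   M     48   2008-11-24 23:46:04  x.x.x.85
   2   X      0                        x.x.x.86
"""

NODE2_OUTPUT = """\
Node  Sts   Inc   Joined               Name
   1   X      0                        x.x.x.85
   2   M     72   2008-11-24 23:32:43  x.x.x.86
"""

if __name__ == "__main__":
    for label, output in (("node 1", NODE1_OUTPUT), ("node 2", NODE2_OUTPUT)):
        view = parse_cman_nodes(output)
        print(label, "does not see as members:", missing_members(view))
```

With the listings from this thread, each node reports the other one as missing, which fits the suggestions above to check multicast on the switch and the firewall rules between the two nodes. In practice the two strings would be replaced by the captured output of "cman_tool nodes" from each node.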