Might be a firewall issue. Doing a netstat -nl listed ports that were not
mentioned in the "simple setup" docs for me. Specifically 14567.

Cheers,
-d

On Wed, 2005-10-19 at 12:49 -0700, Tim Spaulding wrote:
> Hi All,
>
> I have a couple of machines that I'm trying to cluster. The machines are freshly installed FC4
> machines that have been fully updated and running the latest kernel. They are configured to use
> the lvm2 by default so lvm2 and dm was already installed. I'm following the directions in the
> usage.txt off RedHat's web site. I compile the cluster tarball, run depmod, and start ccsd
> without issue. When I do a cman_tool join -w on each node, both nodes start cman and join the
> cluster, but the cluster is apparently partitioned (i.e. they both see the cluster and are joined
> to it, but the two nodes cannot see that the other node is joined). I've searched around and
> haven't found anything specific to this symptom. I have a feeling that it's something to do with
> my network configuration. Any help would be appreciated.
>
> Both machines are i686 archs with dual NICs. The NICs are connected to networks that do not route
> to each other. One network (eth0 on both machines) is a development network. The other network
> (eth1) is our corporate network. I'm trying to configure the cluster to use the dev network
> (eth0).
>
> Here's the output from uname:
>
> Linux ctclinux1.clam.com 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux
> Linux ctclinux2.clam.com 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux
>
> Here's the network configuration on ctclinux1:
>
> eth0      Link encap:Ethernet  HWaddr 00:01:03:26:5C:C9
>           inet addr:192.168.36.200  Bcast:192.168.36.255  Mask:255.255.255.0
>           inet6 addr: fe80::201:3ff:fe26:5cc9/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:7260 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:350 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:449183 (438.6 KiB)  TX bytes:27853 (27.2 KiB)
>           Interrupt:10 Base address:0xec00
>
> eth1      Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:65
>           inet addr:10.10.10.200  Bcast:10.10.255.255  Mask:255.255.0.0
>           inet6 addr: fe80::2b0:d0ff:fe41:f65/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:57450 errors:0 dropped:0 overruns:1 frame:0
>           TX packets:12957 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:10040767 (9.5 MiB)  TX bytes:1962029 (1.8 MiB)
>           Interrupt:5 Base address:0xe880
>
> eth1:1    Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:65
>           inet addr:10.10.10.204  Bcast:10.10.255.255  Mask:255.255.0.0
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           Interrupt:5 Base address:0xe880
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:17568 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:17568 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3692600 (3.5 MiB)  TX bytes:3692600 (3.5 MiB)
>
> sit0      Link encap:IPv6-in-IPv4
>           NOARP  MTU:1480  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 192.168.36.0    *               255.255.255.0   U     0      0        0 eth0
> 10.74.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.72.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.75.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.73.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
> 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> default         10.10.1.1       0.0.0.0         UG    0      0        0 eth1
>
> cat /etc/hosts
> 10.10.10.200    ctclinux1-svc
> 192.168.36.200  ctclinux1-cls
> 192.168.36.201  ctclinux2-cls
> 10.10.10.201    ctclinux2-svc
>
> Here's the network configuration on ctclinux2:
>
> ifconfig -a
> eth0      Link encap:Ethernet  HWaddr 00:01:03:D4:80:7C
>           inet addr:192.168.36.201  Bcast:192.168.36.255  Mask:255.255.255.0
>           inet6 addr: fe80::201:3ff:fed4:807c/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:7702 errors:0 dropped:0 overruns:1 frame:0
>           TX packets:282 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:477769 (466.5 KiB)  TX bytes:22444 (21.9 KiB)
>           Interrupt:10 Base address:0xec00
>
> eth1      Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:9B
>           inet addr:10.10.10.201  Bcast:10.10.255.255  Mask:255.255.0.0
>           inet6 addr: fe80::2b0:d0ff:fe41:f9b/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:53846 errors:0 dropped:0 overruns:1 frame:0
>           TX packets:7759 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:5733713 (5.4 MiB)  TX bytes:1155588 (1.1 MiB)
>           Interrupt:5 Base address:0xe880
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:17912 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:17912 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3401868 (3.2 MiB)  TX bytes:3401868 (3.2 MiB)
>
> sit0      Link encap:IPv6-in-IPv4
>           NOARP  MTU:1480  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 192.168.36.0    *               255.255.255.0   U     0      0        0 eth0
> 10.74.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.72.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.75.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.73.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> 10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
> 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> default         10.10.1.1       0.0.0.0         UG    0      0        0 eth1
>
> cat /etc/hosts
> 10.10.10.201    ctclinux2-svc
> 192.168.36.201  ctclinux2-cls
> 192.168.36.200  ctclinux1-cls
> 10.10.10.200    ctclinux1-svc
>
> Here's the cluster configuration file:
>
> <?xml version="1.0"?>
> <cluster name="cl_tic" config_version="1">
>   <cman>
>   </cman>
>
>   <clusternodes>
>     <clusternode name="ctclinux1-cls">
>       <fence>
>         <method name="single">
>           <device name="human" nodename="ctclinux1-cls"/>
>         </method>
>       </fence>
>     </clusternode>
>
>     <clusternode name="ctclinux2-cls">
>       <fence>
>         <method name="single">
>           <device name="human" nodename="ctclinux2-cls"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>
>   <fence_devices>
>     <fence_device name="human" agent="fence_manual"/>
>   </fence_devices>
> </cluster>
>
> Here's the cluster information from ctclinux1 after the cluster is started and joined:
>
> cman_tool -d join -w
> nodename ctclinux1.clam.com not found
> nodename ctclinux1 (truncated) not found
> nodename ctclinux1 doesn't match ctclinux1-cls (ctclinux1-cls in cluster.conf)
> nodename ctclinux1 doesn't match ctclinux2-cls (ctclinux2-cls in cluster.conf)
> nodename localhost (if lo) not found
> selected nodename ctclinux1-cls
> setup up interface for address: ctclinux1-cls
> Broadcast address for c824a8c0 is ff24a8c0
>
> cman_tool status
> Protocol version: 5.0.1
> Config version: 1
> Cluster name: cl_tic
> Cluster ID: 6429
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 1
> Expected_votes: 2
> Total_votes: 1
> Quorum: 2 Activity blocked
> Active subsystems: 0
> Node name: ctclinux1-cls
> Node addresses: 192.168.36.200
>
> cman_tool nodes
> Node  Votes Exp Sts  Name
>    1    1    2   M   ctclinux1-cls
>
> Here's the cluster information from ctclinux2 after the cluster is started and joined:
>
> cman_tool -d join -w
> nodename ctclinux2.clam.com not found
> nodename ctclinux2 (truncated) not found
> nodename ctclinux2 doesn't match ctclinux1-cls (ctclinux1-cls in cluster.conf)
> nodename ctclinux2 doesn't match ctclinux2-cls (ctclinux2-cls in cluster.conf)
> nodename localhost (if lo) not found
> selected nodename ctclinux2-cls
> setup up interface for address: ctclinux2-cls
> Broadcast address for c924a8c0 is ff24a8c0
>
> cman_tool status
> Protocol version: 5.0.1
> Config version: 1
> Cluster name: cl_tic
> Cluster ID: 6429
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 1
> Expected_votes: 2
> Total_votes: 1
> Quorum: 2 Activity blocked
> Active subsystems: 0
> Node name: ctclinux2-cls
> Node addresses: 192.168.36.201
>
> cman_tool nodes
> Node  Votes Exp Sts  Name
>    1    1    2   M   ctclinux2-cls
>
> Let me know if there is more information that I need to provide. As an aside, I've tried reducing
> the quorum count with no difference in behavior and I've tried using multicast which fails on the
> cman_tool join with an "Unknown Host" error. I'm open to any other suggestions.
>
> Thanks,
>
> tims
>
> --
>
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
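
One quick sanity check on the quoted cman_tool -d output: the hex values on the "Broadcast address for ..." lines look like IPv4 addresses printed with their bytes reversed. That reading is an assumption drawn from the numbers in this thread, not from the cman source. A minimal Python sketch (decode_cman_hex is a made-up helper name) decodes them and compares the result with the broadcast address implied by the eth0 address and mask from the ifconfig output:

```python
import ipaddress


def decode_cman_hex(value: str) -> str:
    """Render an 8-digit hex string as dotted-quad IPv4, reversing the byte order.

    Assumption: the debug output prints the address with its bytes reversed,
    so "c824a8c0" stands for c0.a8.24.c8.
    """
    raw = bytes.fromhex(value)
    return str(ipaddress.IPv4Address(raw[::-1]))


# Values copied from the two "Broadcast address for ..." lines in the thread.
node1 = decode_cman_hex("c824a8c0")
node2 = decode_cman_hex("c924a8c0")
bcast = decode_cman_hex("ff24a8c0")

# Broadcast address implied by eth0's address and netmask on ctclinux1.
expected = ipaddress.IPv4Interface("192.168.36.200/255.255.255.0").network.broadcast_address

print(node1, node2, bcast, expected)
```

If that reading is right, both nodes selected their eth0 addresses and the same broadcast address, so interface selection looks fine and it makes sense to look at packet filtering between the nodes next.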