Re: New Cluster Installation Starts Partitioned

Hi Mark,

Thanks, that solved it.  I had opened up the right ports on my primary node but had forgotten to
do the same on the secondary node, reinforcing Murphy's Second Law of Clustering.  It's always the
little things.  :)
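
For anyone who lands on this thread with the same symptom, here is a minimal Python sketch of how the
partition can be spotted from the quoted `cman_tool status` output: every node reports itself as a
member of the same cluster, yet each one counts only itself. The function names `parse_status` and
`looks_partitioned` are made up for illustration and are not part of cman; the sample text is trimmed
from the status output quoted in this thread.

```python
import re

def parse_status(text):
    """Parse 'Key: value' lines from pasted cman_tool status output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Tolerate mail-quote prefixes such as "> > " in pasted output.
        line = re.sub(r"^(>\s?)+", "", line).strip()
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def looks_partitioned(statuses):
    """True if all nodes claim membership in one cluster but none sees every node."""
    if len(statuses) < 2:
        return False
    same_cluster = len({s.get("Cluster name") for s in statuses}) == 1
    all_members = all(s.get("Cluster Member") == "Yes" for s in statuses)
    nobody_sees_all = all(int(s.get("Nodes", "0")) < len(statuses) for s in statuses)
    return same_cluster and all_members and nobody_sees_all

# Trimmed from the status output quoted in this thread.
NODE1 = """\
Cluster name: cl_tic
Cluster Member: Yes
Nodes: 1
Expected_votes: 2
Total_votes: 1
Quorum: 2  Activity blocked
Node name: ctclinux1-cls
"""
NODE2 = NODE1.replace("ctclinux1-cls", "ctclinux2-cls")

print(looks_partitioned([parse_status(NODE1), parse_status(NODE2)]))
```

If this reports a partition, the fix that worked here was to check the firewall rules on every node,
not just the first one.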

Thanks again,

tims

--- Mark Hlawatschek <hlawatschek@xxxxxxx> wrote:

> Hi Tim,
> 
> Make sure that the cman instances on both nodes can talk to each other. I
> observed this problem when iptables wasn't configured correctly. If you
> have an active iptables config, shut it down and try again.
> 
> Hope that helps ...
> 
> Mark 
> 
> On Wed, 2005-10-19 at 12:49 -0700, Tim Spaulding wrote:
> > Hi All,
> > 
> > I have a couple of machines that I'm trying to cluster.  The machines are freshly installed
> > FC4 machines that have been fully updated and are running the latest kernel.  They are
> > configured to use lvm2 by default, so lvm2 and dm were already installed.  I'm following the
> > directions in the usage.txt from Red Hat's web site.  I compile the cluster tarball, run
> > depmod, and start ccsd without issue.  When I do a cman_tool join -w on each node, both nodes
> > start cman and join the cluster, but the cluster is apparently partitioned (i.e. they both see
> > the cluster and are joined to it, but neither node can see that the other has joined).  I've
> > searched around and haven't found anything specific to this symptom.  I have a feeling that
> > it has something to do with my network configuration.  Any help would be appreciated.
> > 
> > Both machines are i686 archs with dual NICs.  The NICs are connected to networks that do not
> > route to each other.  One network (eth0 on both machines) is a development network.  The other
> > network (eth1) is our corporate network.  I'm trying to configure the cluster to use the dev
> > network (eth0).
> > 
> > Here's the output from uname:
> > 
> > Linux ctclinux1.clam.com 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux
> > Linux ctclinux2.clam.com 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux
> > 
> > Here's the network configuration on ctclinux1:
> > 
> > eth0      Link encap:Ethernet  HWaddr 00:01:03:26:5C:C9
> >           inet addr:192.168.36.200  Bcast:192.168.36.255  Mask:255.255.255.0
> >           inet6 addr: fe80::201:3ff:fe26:5cc9/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:7260 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:350 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:449183 (438.6 KiB)  TX bytes:27853 (27.2 KiB)
> >           Interrupt:10 Base address:0xec00
> > 
> > eth1      Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:65
> >           inet addr:10.10.10.200  Bcast:10.10.255.255  Mask:255.255.0.0
> >           inet6 addr: fe80::2b0:d0ff:fe41:f65/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:57450 errors:0 dropped:0 overruns:1 frame:0
> >           TX packets:12957 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:10040767 (9.5 MiB)  TX bytes:1962029 (1.8 MiB)
> >           Interrupt:5 Base address:0xe880
> > 
> > eth1:1    Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:65
> >           inet addr:10.10.10.204  Bcast:10.10.255.255  Mask:255.255.0.0
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           Interrupt:5 Base address:0xe880
> > 
> > lo        Link encap:Local Loopback
> >           inet addr:127.0.0.1  Mask:255.0.0.0
> >           inet6 addr: ::1/128 Scope:Host
> >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >           RX packets:17568 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:17568 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:3692600 (3.5 MiB)  TX bytes:3692600 (3.5 MiB)
> > 
> > sit0      Link encap:IPv6-in-IPv4
> >           NOARP  MTU:1480  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> > 
> > Kernel IP routing table
> > Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> > 192.168.36.0    *               255.255.255.0   U     0      0        0 eth0
> > 10.74.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.72.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.75.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.73.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
> > 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> > default         10.10.1.1       0.0.0.0         UG    0      0        0 eth1
> > 
> > cat /etc/hosts
> > 10.10.10.200    ctclinux1-svc
> > 192.168.36.200  ctclinux1-cls
> > 192.168.36.201  ctclinux2-cls
> > 10.10.10.201    ctclinux2-svc
> > 
> > Here's the network configuration on ctclinux2:
> > 
> > ifconfig -a
> > eth0      Link encap:Ethernet  HWaddr 00:01:03:D4:80:7C
> >           inet addr:192.168.36.201  Bcast:192.168.36.255  Mask:255.255.255.0
> >           inet6 addr: fe80::201:3ff:fed4:807c/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:7702 errors:0 dropped:0 overruns:1 frame:0
> >           TX packets:282 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:477769 (466.5 KiB)  TX bytes:22444 (21.9 KiB)
> >           Interrupt:10 Base address:0xec00
> > 
> > eth1      Link encap:Ethernet  HWaddr 00:B0:D0:41:0F:9B
> >           inet addr:10.10.10.201  Bcast:10.10.255.255  Mask:255.255.0.0
> >           inet6 addr: fe80::2b0:d0ff:fe41:f9b/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:53846 errors:0 dropped:0 overruns:1 frame:0
> >           TX packets:7759 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:5733713 (5.4 MiB)  TX bytes:1155588 (1.1 MiB)
> >           Interrupt:5 Base address:0xe880
> > 
> > lo        Link encap:Local Loopback
> >           inet addr:127.0.0.1  Mask:255.0.0.0
> >           inet6 addr: ::1/128 Scope:Host
> >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >           RX packets:17912 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:17912 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:3401868 (3.2 MiB)  TX bytes:3401868 (3.2 MiB)
> > 
> > sit0      Link encap:IPv6-in-IPv4
> >           NOARP  MTU:1480  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> > 
> > route
> > Kernel IP routing table
> > Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> > 192.168.36.0    *               255.255.255.0   U     0      0        0 eth0
> > 10.74.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.72.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.75.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.73.0.0       192.168.36.10   255.255.255.0   UG    0      0        0 eth0
> > 10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
> > 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> > default         10.10.1.1       0.0.0.0         UG    0      0        0 eth1
> > 
> > cat /etc/hosts
> > 10.10.10.201    ctclinux2-svc
> > 192.168.36.201  ctclinux2-cls
> > 192.168.36.200  ctclinux1-cls
> > 10.10.10.200    ctclinux1-svc
> > 
> > Here's the cluster configuration file:
> > 
> > <?xml version="1.0"?>
> > <cluster name="cl_tic" config_version="1">
> >         <cman>
> >         </cman>
> > 
> >         <clusternodes>
> >                 <clusternode name="ctclinux1-cls">
> >                         <fence>
> >                                 <method name="single">
> >                                         <device name="human" nodename="ctclinux1-cls"/>
> >                                 </method>
> >                         </fence>
> >                 </clusternode>
> > 
> >                 <clusternode name="ctclinux2-cls">
> >                         <fence>
> >                                 <method name="single">
> >                                         <device name="human" nodename="ctclinux2-cls"/>
> >                                 </method>
> >                         </fence>
> >                 </clusternode>
> >         </clusternodes>
> > 
> >         <fence_devices>
> >                 <fence_device name="human" agent="fence_manual"/>
> >         </fence_devices>
> > </cluster>
> > 
> > Here's the cluster information from ctclinux1 after the cluster is started and joined:
> > 
> > cman_tool -d join -w
> > nodename ctclinux1.clam.com not found
> > nodename ctclinux1 (truncated) not found
> > nodename ctclinux1 doesn't match ctclinux1-cls (ctclinux1-cls in cluster.conf)
> > nodename ctclinux1 doesn't match ctclinux2-cls (ctclinux2-cls in cluster.conf)
> > nodename localhost (if lo) not found
> > selected nodename ctclinux1-cls
> > setup up interface for address: ctclinux1-cls
> > Broadcast address for c824a8c0 is ff24a8c0
> > 
> > cman_tool status
> > Protocol version: 5.0.1
> > Config version: 1
> > Cluster name: cl_tic
> > Cluster ID: 6429
> > Cluster Member: Yes
> > Membership state: Cluster-Member
> > Nodes: 1
> > Expected_votes: 2
> > Total_votes: 1
> > Quorum: 2  Activity blocked
> > Active subsystems: 0
> > Node name: ctclinux1-cls
> > Node addresses: 192.168.36.200
> > 
> > cman_tool nodes
> > Node  Votes Exp Sts  Name
> >    1    1    2   M   ctclinux1-cls
> > 
> > Here's the cluster information from ctclinux2 after the cluster is started and joined:
> > 
> > cman_tool -d join -w
> > nodename ctclinux2.clam.com not found
> > nodename ctclinux2 (truncated) not found
> > nodename ctclinux2 doesn't match ctclinux1-cls (ctclinux1-cls in cluster.conf)
> > nodename ctclinux2 doesn't match ctclinux2-cls (ctclinux2-cls in cluster.conf)
> > nodename localhost (if lo) not found
> > selected nodename ctclinux2-cls
> > setup up interface for address: ctclinux2-cls
> > Broadcast address for c924a8c0 is ff24a8c0
> > 
> > cman_tool status
> > Protocol version: 5.0.1
> > Config version: 1
> > Cluster name: cl_tic
> > Cluster ID: 6429
> > Cluster Member: Yes
> > Membership state: Cluster-Member
> > Nodes: 1
> > Expected_votes: 2
> > Total_votes: 1
> > Quorum: 2  Activity blocked
> > Active subsystems: 0
> > Node name: ctclinux2-cls
> > Node addresses: 192.168.36.201
> > 
> > cman_tool nodes
> > Node  Votes Exp Sts  Name
> >    1    1    2   M   ctclinux2-cls
> > 
> > Let me know if there is more information that I need to provide.  As an aside, I've tried
> > reducing the quorum count with no difference in behavior, and I've tried using multicast, which
> > fails on the cman_tool join with an "Unknown Host" error.  I'm open to any other suggestions.
> > 
> > Thanks,
> > 
> > tims
> > 
> > --
> > 
> > Linux-cluster@xxxxxxxxxx
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> -- 
> Mark Hlawatschek <hlawatschek@xxxxxxx>
> 
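
A side note on the debug line `Broadcast address for c824a8c0 is ff24a8c0` in the quoted output: the
hex words appear to be the raw 32-bit IPv4 addresses as an i686 (little-endian) host would print
them. That reading is an assumption, not something taken from the cman sources, but it decodes to the
eth0 addresses shown in the ifconfig output, which suggests cman did pick the intended interface and
the remaining problem was the firewall. A small Python sketch, with the helper name
`decode_le_hex_ipv4` invented for illustration:

```python
import socket
import struct

def decode_le_hex_ipv4(hex_word):
    """Decode an 8-digit hex word, read as a little-endian uint32, into dotted-quad form."""
    value = int(hex_word, 16)
    # Packing as little-endian restores the network byte order of the address.
    return socket.inet_ntoa(struct.pack("<I", value))

# Hex words taken from the cman_tool -d join -w output quoted in this thread.
for word in ("c824a8c0", "c924a8c0", "ff24a8c0"):
    print(word, "->", decode_le_hex_ipv4(word))
```

Under that assumption, the three words decode to 192.168.36.200, 192.168.36.201, and 192.168.36.255,
matching the eth0 address of each node and the eth0 broadcast address.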


--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster