One possibility may be that the hostnames used in your cluster.conf
file resolve to 127.0.0.1 in /etc/hosts. Then the system will try to
broadcast to the lo device.

By the way, we're using CVS from Nov 21 with broadcast on dual NICs.

On Wed, Dec 01, 2004 at 10:39:49AM -0800, Rick Stevens wrote:
> vahram wrote:
> >Rick Stevens wrote:
> >
> >>
> >>I had a similar issue. The problem was with the multicast routing.
> >>I was using two NICs on each node...one public (eth0) and one private
> >>(eth1), with the default gateway going out eth0.
> >>
> >>The route for the multicast (224.x.x.x) was going out the default
> >>gateway and not reaching the other machine. By putting in a fixed
> >>route for multicast:
> >>
> >>    route add -net 224.0.0.0/8 dev eth1
> >>
> >>it all started working. This was my fix; it may not work for you.
> >>Also, I use the CVS code from http://sources.redhat.com/cluster and
> >>not the source RPMs from where you specified.
> >>----------------------------------------------------------------------
> >>- Rick Stevens, Senior Systems Engineer     rstevens@xxxxxxxxxxxxxxx -
> >>- VitalStream, Inc.                       http://www.vitalstream.com -
> >>-                                                                    -
> >>-  Veni, Vidi, VISA:  I came, I saw, I did a little shopping.        -
> >>----------------------------------------------------------------------
> >>
> >>--
> >>
> >>Linux-cluster@xxxxxxxxxx
> >>http://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> >Yeap, both boxes have two NICs. eth0 is public, and eth1 is private
> >(192.168.2.x). I tried adding the route, and that didn't fix it. I've
> >also tried disabling the private NIC before and running with one public
> >NIC, and that didn't fix it either.
> >One other interesting thing I noticed...when I run cman_tool join on
> >nodeA, netstat shows ccsd trying to do this:
> >
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:739    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:738    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:737    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:736    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:743    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:742    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:741    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:740    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:727    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:731    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:730    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:729    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:728    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:735    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:734    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:733    TIME_WAIT   -
> >tcp        0      0 127.0.0.1:50006    127.0.0.1:732    TIME_WAIT   -
> >
>
> Looking back at your cluster.conf, I see you're using broadcast. I used
> multicast because, in the first CVS checkout I did, broadcast didn't
> work properly. It's possible your SRPMs also have that flaw. Why not
> try multicast and see if that works.
> Add that route I mentioned and here's my cluster.conf which you can
> crib:
>
> <?xml version="1.0"?>
> <cluster name="test" config_version="1">
>
>   <cman two-node="1" expected_votes="1">
>     <multicast addr="224.0.0.1"/>
>   </cman>
>
>   <nodes>
>     <node name="gfs-01-001" votes="1">
>       <multicast addr="224.0.0.1" interface="eth1"/>
>       <fence>
>         <method name="single">
>           <device name="human" ipaddr="gfs-01-001"/>
>         </method>
>       </fence>
>     </node>
>
>     <node name="gfs-01-002" votes="1">
>       <multicast addr="224.0.0.1" interface="eth1"/>
>       <fence>
>         <method name="single">
>           <device name="human" ipaddr="gfs-01-002"/>
>         </method>
>       </fence>
>     </node>
>   </nodes>
>
>   <fence_devices>
>     <device name="human" agent="fence_manual"/>
>   </fence_devices>
> </cluster>
>
> ----------------------------------------------------------------------
> - Rick Stevens, Senior Systems Engineer     rstevens@xxxxxxxxxxxxxxx -
> - VitalStream, Inc.                       http://www.vitalstream.com -
> -                                                                    -
> -  What's small, yellow and very, VERY dangerous?  The root canary!  -
> ----------------------------------------------------------------------

--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<> Brynnen Owen            ( this space for rent                           )<>
<> owen@xxxxxxxx           (                                               )<>
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
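
P.S. The loopback check suggested at the top of this message could be scripted roughly as follows. This is only a sketch, not part of cman or ccsd: the helper names node_names and loopback_nodes are made up for illustration, and it assumes the node names sit in the name attribute of <node> elements, as in the cluster.conf quoted above.

```python
import ipaddress
import socket
import xml.etree.ElementTree as ET


def node_names(conf_xml):
    """Return the name attribute of every <node> element in a cluster.conf string."""
    root = ET.fromstring(conf_xml)
    return [n.get("name") for n in root.iter("node") if n.get("name")]


def loopback_nodes(conf_xml, resolve=socket.gethostbyname):
    """Return the node names that resolve to a loopback address (127.0.0.0/8).

    `resolve` is injectable so the check can be exercised without touching
    the real resolver; by default it consults /etc/hosts and DNS in the
    usual system order.
    """
    bad = []
    for name in node_names(conf_xml):
        try:
            addr = resolve(name)
        except OSError:
            # Names that do not resolve at all are a separate problem.
            continue
        if ipaddress.ip_address(addr).is_loopback:
            bad.append(name)
    return bad
```

Feeding it the contents of your cluster.conf (wherever your install keeps it) should return an empty list on a healthy node. Any name it does return probably sits on the 127.0.0.1 line of /etc/hosts.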