I had a similar issue. The problem was with the multicast routing. I was using two NICs on each node: one public (eth0) and one private (eth1), with the default gateway going out eth0.
The route for multicast (224.x.x.x) was going out the default gateway, so the traffic never reached the other machine. Once I put in a fixed route for multicast:
route add -net 224.0.0.0/8 dev eth1
it all started working. This was my fix; it may not work for you. Also, I use the CVS code from http://sources.redhat.com/cluster and not the source RPMs from the location you mentioned.
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens@xxxxxxxxxxxxxxx -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-      Veni, Vidi, VISA: I came, I saw, I did a little shopping.     -
----------------------------------------------------------------------
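To check which interface multicast traffic would actually leave on, something along these lines may help. It is only a sketch: it assumes the iproute2 `ip` tool is installed, uses 224.0.0.1 (the all-hosts group) purely as a sample multicast address, and `route_dev` is a hypothetical helper name, not part of any cluster tool.

```shell
# Print the interface named after "dev" in `ip route get` output read from stdin.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Ask the kernel which route it would pick for a sample multicast address.
# Skipped quietly if `ip` is not available on this machine.
if command -v ip >/dev/null 2>&1; then
    ip route get 224.0.0.1 2>/dev/null | route_dev
fi
```

On a two-NIC setup like the one described above, seeing eth0 here would suggest multicast is still following the default gateway; after adding the fixed route, eth1 would be the expected answer.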
-- Linux-cluster@xxxxxxxxxx http://www.redhat.com/mailman/listinfo/linux-cluster
Yep, both boxes have two NICs: eth0 is public and eth1 is private (192.168.2.x). I tried adding the route, and that didn't fix it. I had also tried disabling the private NIC earlier and running with only the public NIC, and that didn't fix it either. One other interesting thing I noticed: when I run cman_tool join on nodeA, netstat shows ccsd doing this:
tcp 0 0 127.0.0.1:50006 127.0.0.1:739 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:738 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:737 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:736 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:743 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:742 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:741 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:740 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:727 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:731 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:730 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:729 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:728 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:735 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:734 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:733 TIME_WAIT -
tcp 0 0 127.0.0.1:50006 127.0.0.1:732 TIME_WAIT -
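A listing like the one above can be condensed into a count per local endpoint, which makes it easier to see whether the number of TIME_WAIT sockets keeps growing between runs. This is a rough sketch that assumes the usual `netstat -tan` column order (protocol, queues, local address, foreign address, state); `count_time_wait` is a hypothetical helper name.

```shell
# Count TIME_WAIT sockets per local address:port from netstat-style lines on stdin.
count_time_wait() {
    awk '$1 ~ /^tcp/ && $6 == "TIME_WAIT" { n[$4]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}

# Summarize the live socket table, if netstat is available on this machine.
if command -v netstat >/dev/null 2>&1; then
    netstat -tan 2>/dev/null | count_time_wait
fi
```

Run against the 17 lines above, this would collapse them into a single line for 127.0.0.1:50006.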
-vahram