Hi,

Sorry to bother you all once more. I'm seeing two problems when trying to get ccs/cman working.

On my Celeron 2GHz, when I try to start ccsd and cman, all is well. I start ccsd, then 'cman_tool join', and the machine begins periodically broadcasting packets like these:

  23:22:26.300381 IP 10.0.0.1.6809 > 10.0.0.255.6809: UDP, length 24
  23:22:26.300491 IP 10.0.0.1.6809 > 10.0.0.255.6809: UDP, length 24

However, when I try the exact same thing on a Dual Xeon in the same subnet, I get this:

  23:19:51.095492 IP 10.0.0.3.32770 > 255.255.255.255.50007: UDP, length 20
  23:19:51.344805 arp who-has 10.0.0.9 tell 10.0.0.3
  23:19:52.344396 arp who-has 10.0.0.9 tell 10.0.0.3
  23:19:53.344257 arp who-has 10.0.0.9 tell 10.0.0.3

The machine begins ARPing for 10.0.0.9 -- but that IP isn't even used at all! It doesn't broadcast like the other machines do, and after waiting for a while, both machines decide to create a new cluster instead of trying to talk to each other.

Furthermore, when I try to 'cman_tool leave' on the dual proc, I get:

  Jun 26 22:51:43 phi kernel: CMAN: we are leaving the cluster
  Jun 26 22:51:43 phi ccsd[9833]: Received bad communication type on cluster socket.
  Jun 26 22:51:49 phi last message repeated 106830 times

syslogd then keeps logging that message in a loop until I kill ccsd. On the uniproc, I don't get any such error at all when I issue a leave:

  Jun 26 22:51:40 xi kernel: CMAN: we are leaving the cluster
  Jun 26 22:51:40 xi ccsd[2181]: Unable to bind cluster socket: Transport endpoint is not connected
  Jun 26 22:51:40 xi ccsd[2181]: Exiting...

I tried a UP kernel (the exact same one as on the uniproc) on the dual proc, but got the same result.

Does anyone have any clues? Anything obvious I forgot? I've attached /etc/cluster/cluster.xml -- it's identical on both machines, and they both run the same kernel and the same binary packages (I hope). Do I have to provide more info?

cheers,
Lennert
<?xml version="1.0"?>
<cluster name="alpha" config_version="1">
  <cman>
  </cman>
  <nodes>
    <node name="phi" votes="2">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.0.0.3"/>
        </method>
      </fence>
    </node>
    <node name="xi" votes="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.0.0.1"/>
        </method>
      </fence>
    </node>
  </nodes>
  <fence_devices>
    <device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>