Hi there,

I have a cluster with three nodes (all identical HP DL380 G4s) attached to a Fibre SAN (HP MSA1000) and serving a number of GFS filesystems. My OS is Ubuntu Dapper (6.06) and my kernel is 2.6.15-29-amd64-server. These machines have been working nicely for a long time.

On the weekend I upgraded with apt-get to the latest version of the Dapper redhat-cluster-suite package (1.20060222-0ubuntu6.1). Now, when the cluster boots, the first two nodes to come up are able to see the GFS filesystem. However, the third node to come up hangs at the point of starting the clvm service. At the same time, I see the following messages in /var/log/syslog on one of the other machines in the cluster:

Oct 28 14:42:18 machinea kernel: [ 1681.325152] CMAN: node machinec rejoining
Oct 28 14:42:20 machinea kernel: [ 1683.528299] Extra connection from node 2 attempted

It does not seem to matter in which order the nodes come up: it is always the third node to boot that hangs when starting clvmd.

I have included my cluster.conf file below for reference. I can include any additional diagnostics as required.

Any help would be most appreciated!

Stephen

<?xml version="1.0"?>
<cluster config_version="14" name="alpha_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="machineaint" votes="1">
      <fence>
        <method name="1">
          <device name="machinea_ILO"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="machinebint" votes="1">
      <fence>
        <method name="1">
          <device name="machineb_ILO"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="machinecint" votes="1">
      <fence>
        <method name="1">
          <device name="machinec_ILO"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="192.168.81.200" login="Login" name="machinea_ILO" passwd="Passwd"/>
    <fencedevice agent="fence_ilo" hostname="192.168.81.199" login="Login" name="machineb_ILO" passwd="Passwd"/>
    <fencedevice agent="fence_ilo" hostname="192.168.81.197" login="Login" name="machinec_ILO" passwd="Passwd"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="fileservers" ordered="0" restricted="0">
        <failoverdomainnode name="machineaint" priority="1"/>
        <failoverdomainnode name="machinebint" priority="1"/>
        <failoverdomainnode name="machinecint" priority="1"/>
      </failoverdomain>
      <failoverdomain name="backupers" ordered="0" restricted="1">
        <failoverdomainnode name="machineaint" priority="1"/>
        <failoverdomainnode name="machinebint" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.81.98" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="fileservers" exclusive="1" name="fileserver_ip">
      <ip ref="192.168.81.98"/>
    </service>
    <service autostart="1" domain="backupers" name="backups">
      <script file="/etc/init.d/dsmcad-init" name="TSM backup script"/>
    </service>
  </rm>
</cluster>

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
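
As a quick sanity check on the cluster.conf above, a minimal Python sketch along the following lines could confirm that every fence device a node refers to is actually defined and that every failover-domain member is a known cluster node. It is not part of the original setup: the function name check_cluster_conf and the keys of the report are made up for illustration, and it assumes that a clusternode without a votes attribute counts as one vote. It only checks the internal consistency of the file and cannot by itself explain the clvmd hang.

```python
import xml.etree.ElementTree as ET


def check_cluster_conf(xml_text):
    """Run simple cross-reference checks on the text of a cluster.conf.

    Hypothetical helper for illustration only; it is not part of
    redhat-cluster-suite.
    """
    root = ET.fromstring(xml_text.strip())

    # Node name -> votes. Assumption: a missing votes attribute means 1.
    nodes = {
        node.get("name"): int(node.get("votes", "1"))
        for node in root.iter("clusternode")
    }

    # Names declared in the <fencedevices> section.
    defined_devices = {dev.get("name") for dev in root.iter("fencedevice")}

    # <device> elements below a <clusternode> refer to fence devices by name.
    missing_devices = sorted(
        dev.get("name")
        for node in root.iter("clusternode")
        for dev in node.iter("device")
        if dev.get("name") not in defined_devices
    )

    # Failover-domain members that are not declared as cluster nodes.
    domain_members = {m.get("name") for m in root.iter("failoverdomainnode")}
    unknown_domain_nodes = sorted(domain_members - set(nodes))

    return {
        "nodes": sorted(nodes),
        "total_votes": sum(nodes.values()),
        "missing_fence_devices": missing_devices,
        "unknown_domain_nodes": unknown_domain_nodes,
    }


if __name__ == "__main__":
    import sys

    # Usage: python check_conf.py /path/to/cluster.conf
    with open(sys.argv[1]) as fh:
        print(check_cluster_conf(fh.read()))
```

For the configuration posted above, the two "missing" and "unknown" lists should come back empty and the report should list three nodes. Running the same check on each node would also show whether all three are reading the same copy of the file.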