Mark Chaney wrote:
> Can someone explain these errors to me and tell me how I should attempt to
> resolve them? They aren't both happening at exactly the same time; they're
> just two errors that I don't truly understand.
>
> ####################
>
> ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
> ccsd[3192]: Error while processing disconnect: Invalid request descriptor
>
> ##################
>
> openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined the cluster with existing state

I need to add this to the FAQ!

What the openais message means is that a node was once a valid member of
the cluster; it then left the cluster (without being fenced) and rejoined
automatically. This can sometimes happen if the ethernet is disconnected
for a time, usually a few seconds.

If a node leaves the cluster, it MUST rejoin using the cman_tool join
command with no services running. The usual way to make this happen is to
reboot the node, and if fencing is configured correctly that is what
normally happens. It could be that fencing is too slow to manage this, or
that the cluster is made up of two nodes without a quorum disk, so that the
'other' node doesn't have quorum and cannot initiate fencing.

Another (more common) cause of this is slow response from some Cisco
switches, as documented here:

http://www.openais.org/doku.php?id=faq:cisco_switches

--
Chrissie
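As a rough illustration of the rejoin rule described above, here is a toy Python model. It is only a sketch of the reasoning, not how openais or cman implement membership: the ToyCluster class, its method names, and the has_existing_state flag are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy membership model (hypothetical; not the openais implementation)."""
    members: set = field(default_factory=set)
    # Nodes that dropped out without being fenced or rebooted.
    left_dirty: set = field(default_factory=set)

    def join(self, node: str, has_existing_state: bool) -> str:
        # A node that vanished and comes back still carrying its old
        # cluster state is refused, as in the "Killing node ..." message.
        if node in self.left_dirty and has_existing_state:
            return "killed"
        self.left_dirty.discard(node)
        self.members.add(node)
        return "joined"

    def lose_contact(self, node: str) -> None:
        # For example, the ethernet link drops for a few seconds.
        self.members.discard(node)
        self.left_dirty.add(node)

    def fence(self, node: str) -> None:
        # Fencing (normally a reboot) leaves the node with no stale state.
        self.members.discard(node)
        self.left_dirty.discard(node)


cluster = ToyCluster()
cluster.join("ratchet.local", has_existing_state=False)

# Network blip: the node drops out and comes back on its own.
cluster.lose_contact("ratchet.local")
outcome_auto = cluster.join("ratchet.local", has_existing_state=True)

# After fencing/reboot the node rejoins cleanly with no services running.
cluster.fence("ratchet.local")
outcome_clean = cluster.join("ratchet.local", has_existing_state=False)

print(outcome_auto, outcome_clean)
```

The point of the sketch is that the automatic rejoin is rejected because the node still carries state from its earlier membership, while the rejoin after a fence or reboot is accepted.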