Please send your fence configuration and cluster.conf.

Regards

2010/1/25, Alex Urbanowicz <alex.urbanowicz@xxxxxxxxx>:
> Hello
>
> I have a problem with a shared GFS resource on a 12-node Cluster Manager
> cluster.
>
> The cluster starts up properly if all nodes are booted at once. Any major
> interaction with one of the nodes (reboot, cman restart) causes the GFS to
> lock up and the cluster to fall into an unstable split state.
>
> In this state, the logs, clustat and "cman_tool status" report the cluster as
> fully connected and working, while "cman_tool resources" reports only the
> fence resource, in the JOIN_START_WAIT (or JOIN_STOP_WAIT, depending on what
> was done to the cluster in the meantime) state, with overlapping but different
> node sets depending on the node on which I run the "cman_tool resources"
> command.
>
> So far, the only working method of getting the cluster out of this state is
> to manually reboot all the nodes at once, but this is unfeasible due to
> uptime expectations and the high load carried by the cluster.
>
> We're completely in the dark about the possible cause of the problem; any
> help is appreciated.
>
> TIA
>
> Alex
>

-- 
Jorge Palma Escobar
Ingeniero de Sistemas
Red Hat Linux Certified Engineer
Certificate Nº 804005089418233

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster