Brett Cave wrote:
> hi,
>
> On a 6 node cluster, 2 nodes (1 & 6) were fenced. On coming back up,
> the 2 nodes were not able to start the cman service.
>
> All the other nodes have activity blocked. Disallowed nodes are (from
> cman_tool status)
> node2: 3,4,5
> node3: 2,4,5
> node4: 2,3,5
> node5: 2,3,4
>
> node1 & node6 - cman not running.
>
> Am using qdisk, and all running nodes have the disallowed list flagged
> as "d" - disallowed.
> Each node then also has:
> X (not a cluster member) for qdisk and the 2 fenced nodes that cman
> will not start on.
> d (on the 3 running nodes other than current)
> M (on the self-node - i.e. if run on node2, then node2 = M)
>
>
> This is what I get in logs when I try to start cman on 1 of the X nodes...
> openais[10465]: CMAN: Joined a cluster with disallowed nodes. must die
>
>
> I can't get the nodes to restart cman - "service gfs stop" to unmount
> gfs mounts hangs... the following process is not able to complete:
> /sbin/umount.gfs /my/mountpoint1
>
> Is there a way to get the cluster to recover from this? Going to be
> fencing all the nodes now to get the system up.

The cman_tool man page has some detail on disallowed mode. But also
check the version. cman in RHEL5.3 has a bug that can cause this to
happen. I believe a hot fix is in the works somewhere...

Chrissie
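
For anyone reading this thread later, the state described in the original post can be tabulated with a small sketch. It uses only the disallowed lists quoted above; the function names are made up for illustration, and nothing here calls cman_tool or reproduces its real output format.

```python
# Disallowed lists exactly as reported in the post (one entry per running node).
disallowed = {
    2: [3, 4, 5],
    3: [2, 4, 5],
    4: [2, 3, 5],
    5: [2, 3, 4],
}
all_nodes = range(1, 7)  # 6-node cluster; cman is not running on nodes 1 and 6


def flags_seen_by(node):
    """Flags a running node reports for each node, per the post:
    M = self, d = disallowed, X = not a cluster member."""
    view = {}
    for other in all_nodes:
        if other == node:
            view[other] = "M"
        elif other in disallowed[node]:
            view[other] = "d"
        else:
            view[other] = "X"
    return view


def mutually_disallowed():
    """True if every running node disallows every other running node."""
    running = set(disallowed)
    return all(set(disallowed[n]) == running - {n} for n in running)


for node in sorted(disallowed):
    view = flags_seen_by(node)
    print(f"node{node}: " + " ".join(f"{o}={view[o]}" for o in all_nodes))
print("every running node disallows all the others:", mutually_disallowed())
```

With the lists from the post, each of the four running nodes shows itself as M, the other three running nodes as d, and nodes 1 and 6 as X. The qdisk entry, which the post also reports as X, is left out of the table.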