On Thu, May 03, 2007 at 11:27:08AM +0200, Sebastian Walter wrote:
> Does anybody have a solution for this? Is there any documentation about
> the Code messages?
>
>
> Sebastian Walter wrote:
> >Thanks for your help. These are /proc/cluster/services:
> >
> >###master
> >Service Name GID LID State Code
> >Fence Domain: "default" 6 2 run -
> >[3 2 1]
> >
> >DLM Lock Space: "clvmd" 5 3 join
> >S-6,20,3
> >[3 2 1]
> >
> >### node1:
> >Service Name GID LID State Code
> >Fence Domain: "default" 6 2 run -
> >[3 2 1]
> >
> >DLM Lock Space: "clvmd" 5 3 update
> >U-4,1,1
> >[2 3 1]
> >
> >### node2:
> >Service Name GID LID State Code
> >Fence Domain: "default" 6 3 run -
> >[3 2 1]
> >
> >DLM Lock Space: "clvmd" 5 4 update
> >U-4,1,1
> >[2 3 1]

This says that the dlm is stuck in recovery on all the nodes.

Which version of the code are you using?
Has this happened more than once?
Does the cluster have quorum? (cman_tool status)
What does /proc/cluster/dlm_debug show from all nodes?
What are the dlm threads waiting on? (ps ax -o pid,stat,wchan,cmd | grep dlm)

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
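
The checks asked for above could be gathered on each node in one pass. What follows is only a rough sketch: the function name collect_dlm_diag and the default output path are made up for illustration and are not part of cman or the dlm. It runs nothing beyond the commands and /proc files already named in this thread, and it uses grep '[d]lm' instead of grep dlm so that grep does not list itself.

```shell
#!/bin/sh
# Sketch: collect the DLM diagnostics discussed in this thread into one file.
# collect_dlm_diag and the default output path are hypothetical names.
collect_dlm_diag() {
    out="${1:-/tmp/dlm-diag-$(hostname).txt}"
    {
        echo "== cman_tool status =="
        if command -v cman_tool >/dev/null 2>&1; then
            cman_tool status 2>&1
        else
            echo "cman_tool not found"
        fi

        for f in /proc/cluster/services /proc/cluster/dlm_debug; do
            echo "== $f =="
            if [ -r "$f" ]; then
                cat "$f"
            else
                echo "$f not readable"
            fi
        done

        echo "== dlm threads =="
        # The bracket in [d]lm keeps grep from matching its own process entry.
        ps ax -o pid,stat,wchan,cmd | grep '[d]lm' || echo "no dlm threads found"
    } > "$out"
    # Print the path so the caller knows which file to attach.
    echo "$out"
}
```

Calling collect_dlm_diag with no argument on the master, node1 and node2 would leave one file per node under /tmp, which can then be attached to a reply. An explicit path can be passed as the first argument instead.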