Nuno Fernandes wrote:
> Hi,
>
> we have a cluster with 7 machines with a SAN. We are using them to
> provide virtual machines, so we are using clvmd.
>
> At some point we are unable to access any of the pv/lv/vg tools. They
> are all stuck. From stracing them I've come to the conclusion that they
> are waiting for clvmd.
>

They could be waiting for fencing to complete. Have a look at the output
from group_tool; that will tell you which services have recovered after
a node has joined or left the cluster.

Chrissie

> Nuno Fernandes
>
> in host xen1:
>
> Linux blade01.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.32-4.el5
> cman-2.0.84-2.el5_2.1
>
>   PID TTY      STAT   TIME COMMAND
> 20874 ?        D<     0:00  \_ [dlm_recoverd]
> 20854 pts/1    S+     0:00  \_ /bin/sh /sbin/service clvmd start
> 20861 pts/1    S+     0:00      \_ /bin/bash /etc/init.d/clvmd start
> 20931 pts/1    S+     0:00          \_ /usr/sbin/vgscan -d
> 20869 ?        Ssl    0:00 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> 20874 [dlm_recoverd]  -
>
> ------------------------------
>
> Connection to xen1 closed.
>
> in host xen2:
>
> Linux blade02.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.16-3.el5
> cman-2.0.64-1.0.1.el5
>
>   PID TTY      STAT   TIME COMMAND
> 22662 ?        D<     0:00  \_ [dlm_recoverd]
> 22613 ?        Ssl    0:02 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> 22662 [dlm_recoverd]  -
>
> ------------------------------
>
> Connection to xen2 closed.
>
> in host xen3:
>
> Linux blade03.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.16-3.el5
> cman-2.0.64-1.0.1.el5
>
>   PID TTY      STAT   TIME COMMAND
> 22236 ?        D<     0:00  \_ [dlm_recoverd]
> 22231 ?        Ssl    0:02 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> Connection to xen3 closed.
> 22236 [dlm_recoverd]  dlm_wait_function
>
> ------------------------------
>
> in host xen4:
>
> Linux blade04.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.16-3.el5
> cman-2.0.64-1.0.1.el5
>
>   PID TTY      STAT   TIME COMMAND
> 25097 ?        D<     0:00  \_ [dlm_recoverd]
> 25092 ?        Ssl    0:02 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> 25097 [dlm_recoverd]  dlm_wait_function
>
> ------------------------------
>
> Connection to xen4 closed.
>
> in host xen5:
>
> Linux blade05.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.32-4.el5
> cman-2.0.84-2.el5_2.1
>
>   PID TTY      STAT   TIME COMMAND
> 22333 ?        D<     0:00  \_ [dlm_recoverd]
> 22328 ?        Ssl    0:02 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> 22333 [dlm_recoverd]  -
>
> ------------------------------
>
> Connection to xen5 closed.
>
> in host xen6:
>
> Linux blade06.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.32-4.el5
> cman-2.0.84-2.el5_2.1
>
>   PID TTY      STAT   TIME COMMAND
>
> ps ax -o pid,cmd,wchan
>
> ------------------------------
>
> Connection to xen6 closed.
>
> in host xen7:
>
> Linux blade07.dc.xpto.com 2.6.18-92.1.13.el5xen #1 SMP Wed Sep 24
> 20:01:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> lvm2-cluster-2.02.32-4.el5
> cman-2.0.84-2.el5
> cman-2.0.84-2.el5_2.1
>
>   PID TTY      STAT   TIME COMMAND
> 19793 ?        D<     0:00  \_ [dlm_recoverd]
> 19788 ?        Ssl    0:01 clvmd -T40
>
> ps ax -o pid,cmd,wchan
>
> 19793 [dlm_recoverd]  -
>
> ------------------------------
>
> Connection to xen7 closed.
>

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster