I am running a two-node cluster on CentOS 5.4.
Until yesterday I had always started the cluster services manually
(service cman start, service clvmd start, ...) and had already noticed
that "service clvmd start" never finishes, so I used "service clvmd
start &" to get a working shell back.
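(In case it helps with debugging: while the backgrounded clvmd is still
hanging, I assume something like this should show what it is blocked
on; the -d flag is from memory and may differ on other versions:

  pidof clvmd                  # find the daemon's pid
  strace -p $(pidof clvmd)    # see which syscall it is sitting in
  clvmd -d                     # alternative: run it in the foreground with debug output
)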
Yesterday I tried autostarting the cluster services at boot (just cman,
clvmd, gfs; no rgmanager yet), and now clvmd gets stuck in the same
way. I had not looked into the order of the init scripts before; since
clvmd is started long before sshd, I now cannot log in, neither
remotely nor locally.
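(For reference, the start order can be read off the S-numbers of the rc
symlinks, e.g. for runlevel 3:

  ls /etc/rc3.d/ | egrep 'cman|clvmd|gfs|sshd'      # lower S-number = started earlier
  chkconfig --list | egrep 'cman|clvmd|gfs|sshd'
)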
I had started both nodes for this test, the other one without
autostarting the services, so I can log into that one and check cluster
sanity.
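(To map the numeric node IDs in the output below to node names, I
believe cman_tool can be used:

  cman_tool status   # quorum and membership summary
  cman_tool nodes    # node id <-> name mapping
)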
group_tool ls looks like this:
[root@pclus1cent5-01 ~]# group_tool ls
type   level name       id       state
fence  0     default    00010002 none
[1 2]
dlm    1     clvmd      00020002 none
[1 2]
dlm    1     XenImages  00020001 none
[1]
dlm    1     XenConfigs 00040001 none
[1]
dlm    1     rgmanager  00050001 none
[1]
gfs    2     XenImages  00010001 none
[1]
gfs    2     XenConfigs 00030001 none
[1]
XenConfigs and XenImages are GFS volumes. It looks like they are
mounted on both nodes, but clvmd only runs on one, which is weird since
the GFS volumes are set up on clustered LVs. Or am I misinterpreting
the output?
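(As far as I understand, whether a VG is actually marked clustered can
be checked with vgs; the sixth attribute character should be a 'c':

  vgs -o vg_name,vg_attr
)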
Now the questions are:
Can I determine from the working node what the problem on the stuck
node is? Can I force clvmd to finish or to give up?
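(A sketch of what I was thinking of trying from the working node; the
node name in the second command is made up, and I have not run the
fence yet:

  group_tool dump              # groupd debug buffer, maybe it shows what groupd is waiting for
  fence_node pclus1cent5-02    # hypothetical name of the stuck node; last resort, forcibly fences it
)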
And is there anything I have to do, or can do, to the clvmd init script
to make it behave as expected?
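One idea, completely untested and just a sketch: let boot continue even
if clvmd blocks, e.g. by wrapping the start in /etc/init.d/clvmd with a
timeout (the 30-second limit is an arbitrary value I picked):

  # untested sketch: start clvmd in the background and wait a limited time
  /usr/sbin/clvmd &
  for i in $(seq 1 30); do
      pidof clvmd >/dev/null 2>&1 && break   # daemon is up, stop waiting
      sleep 1
  done

But I would rather understand why it hangs in the first place.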
Dirk
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster