Dear all,
I am running GFS 6.1 with DLM on a cluster (4 nodes + front-end) of
dual-headed Opterons under RHEL4U3. Because of some problems (kernel
panics...) I had to hard-reboot some nodes of the cluster. Now some GFS
partitions won't mount: the mount simply waits forever for the
"join" of the GFS mount group (console output below).
So... three questions:
- What exactly is the join doing? The cluster status is fine and every
node is a member...
- What does the status code (S-2,2,4) in the cman_tool services output mean?
- What can I do to restart this cluster?
NB: Before testing this (below), I rebooted the complete cluster and
ran gfs_fsck on all nodes with everything unmounted.
----------------------------------------------------------------------------------------------------
root # service clvmd start
root # service gfs start
Mounting GFS filesystems:        # hangs here forever!
In another console I get:
root # dmesg | tail
...
GFS: fsid=globcover:baieGC2b.0: jid=14: Done
GFS: fsid=globcover:baieGC2b.0: jid=15: Trying to acquire journal lock...
GFS: fsid=globcover:baieGC2b.0: jid=15: Looking at journal...
GFS: fsid=globcover:baieGC2b.0: jid=15: Done
GFS: Trying to join cluster "lock_dlm", "globcover:baieGC3a"
root # cman_tool services
Service          Name          GID LID State  Code
Fence Domain:    "default"      11   2 run    -
[1 5 4 3 2]
DLM Lock Space:  "clvmd"        12   3 run    -
[1 5 4 3 2]
DLM Lock Space:  "baieGC2b"     13   4 run    -
[1 5]
DLM Lock Space:  "baieGC3a"     15   6 run    -
[1 5 2 4 3]
GFS Mount Group: "baieGC2b"     14   5 run    -
[1 5]
GFS Mount Group: "baieGC3a"      0   7 join   S-2,2,4
[]
root # cman_tool status
Protocol version: 5.0.1
Config version: 8
Cluster name: globcover
Cluster ID: 53692
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 5
Expected_votes: 5
Total_votes: 5
Quorum: 3
Active subsystems: 9
Node name: globcover-fe
Node addresses: 10.1.1.1
root # cman_tool nodes
Node  Votes  Exp  Sts  Name
   1      1    5    M  globcover-fe
   2      1    5    M  compute-0-3
   3      1    5    M  compute-0-2
   4      1    5    M  compute-0-1
   5      1    5    M  compute-0-0
----------------------------------------------------------------------------------------------------
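For reference, here is a minimal sketch of how the groups that are not in the "run" state could be picked out of the cman_tool services output. It is not taken from any cman documentation: the function name stuck_groups is hypothetical, and it assumes each service line has a quoted group name followed by GID, LID and State, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: list service groups whose State column is not "run".
# Reads `cman_tool services` output on stdin, so it also works on a saved copy.
stuck_groups() {
    awk '
        # Service lines carry a quoted group name; header and member-list
        # lines ("[1 5 4 3 2]") have no quotes and are skipped.
        match($0, /"[^"]*"/) {
            name = substr($0, RSTART + 1, RLENGTH - 2)   # name without quotes
            rest = substr($0, RSTART + RLENGTH)          # GID LID State [Code]
            type = substr($0, 1, RSTART - 1)             # e.g. "GFS Mount Group: "
            sub(/[ \t]+$/, "", type)                     # trim trailing blanks
            n = split(rest, f, " ")                      # f[1]=GID f[2]=LID f[3]=State
            if (n >= 3 && f[3] != "run")
                print type, name, f[3]
        }
    '
}

# Usage on a cluster node (assumption: cman_tool is in PATH):
#   cman_tool services | stuck_groups
```

On the output above this would report only the GFS mount group "baieGC3a", which is the one stuck in the join state.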
Thanks,
--
------------------------------------------------------------------------
Fernando NIÑO CNES - BPi 2102
Medias-France/IRD 18, Av. Edouard Belin
Tél: 05.61.27.40.74 31401 Toulouse Cedex 9