On 08/07/2012 01:07 AM, Chip Burke wrote:
I had a node crash (actually, lost power) and now when the cluster comes back up none of the PV/VG/LVs that contain the GFS2 volumes can be found. pvscan, lvscan, vgscan etc. all hang.

    # pvscan -vvvv
    #lvmcmdline.c:1070 Processing: pvscan -vvvv
    #lvmcmdline.c:1073 O_DIRECT will be used
    #libdm-config.c:789 Setting global/locking_type to 3
    #libdm-config.c:789 Setting global/wait_for_locks to 1
    #locking/locking.c:271 Cluster locking selected.

The output is more or less the same from lvscan and vgscan. The cluster is pretty basic and I was in the midst of configuring fencing when this went down, thus the config has no fence in it.

    <?xml version="1.0"?>
    <cluster config_version="5" name="Xanadu">
      <clusternodes>
        <clusternode name="xanadunode1" nodeid="1"/>
        <clusternode name="xanadunode2" nodeid="2"/>
      </clusternodes>
      <cman expected_votes="3"/>
      <quorumd label="quorum"/>
    </cluster>

Additionally the cluster logs all show similar unending messages such as:

    Aug 07 01:03:12 dlm_controld daemon cpg_join error retrying
    Aug 07 01:03:46 corosync [TOTEM ] Retransmit List: 13
    Aug 07 01:04:04 gfs_controld cpg_mcast_joined retry 31200 protocol
    Aug 07 01:04:12 fenced daemon cpg_join error retrying

Also

    # cman_tool status
    Version: 6.2.0
    Config Version: 5
    Cluster Name: Xanadu
    Cluster Id: 10121
    Cluster Member: Yes
    Cluster Generation: 2084
    Membership state: Cluster-Member
    Nodes: 2
    Expected votes: 3
    Quorum device votes: 1
    Total votes: 3
    Node votes: 1
    Quorum: 2
    Active subsystems: 11
    Flags:
    Ports Bound: 0 11 178
    Node name: xanadunode2
    Node ID: 2
    Multicast addresses: 239.192.39.176
    Node addresses: 192.168.30.66

So cman is up and working. It seems that clvmd and the tools it depends on are simply not wanting to play nice. What do I have to do to get those volumes to mount?
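Without fencing, the cluster has no way to put the lost node into a known state, so the only safe option remaining is to hang. That is why DLM blocks, and clvmd and GFS2, which depend on it, block with it. This is by design. You have to add fencing to your cluster.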
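As a rough sketch of what that could look like, here is the posted cluster.conf with a fence method added to each node. It assumes both nodes have IPMI-capable management boards and uses the fence_ipmilan agent; your hardware may call for a different agent. The device names (ipmi_n1, ipmi_n2), the BMC addresses, and the login/password values are placeholders invented for illustration, not taken from the original config.

```xml
<?xml version="1.0"?>
<!-- config_version is raised from 5 to 6 so the change is seen as newer -->
<cluster config_version="6" name="Xanadu">
  <clusternodes>
    <clusternode name="xanadunode1" nodeid="1">
      <fence>
        <!-- one fence method per node; "ipmi" is an arbitrary label -->
        <method name="ipmi">
          <device name="ipmi_n1" action="reboot"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="xanadunode2" nodeid="2">
      <fence>
        <method name="ipmi">
          <device name="ipmi_n2" action="reboot"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- placeholder addresses and credentials: replace with your BMC details -->
    <fencedevice name="ipmi_n1" agent="fence_ipmilan" ipaddr="192.168.30.101" login="admin" passwd="secret"/>
    <fencedevice name="ipmi_n2" agent="fence_ipmilan" ipaddr="192.168.30.102" login="admin" passwd="secret"/>
  </fencedevices>
  <cman expected_votes="3"/>
  <quorumd label="quorum"/>
</cluster>
```

Each device name under a node's method has to match a fencedevice entry, and it is worth running ccs_config_validate on the edited file before putting it into service.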
The fencing section of this tutorial explains it in detail:

https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing

--
Digimer
Papers and Projects: https://alteeve.com

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster