On Fri, Aug 14, 2009 at 10:30:17AM +0800, Wengang Wang wrote:
> anything else needed?

The versions are good.

> #don't know where 239.192.110.55 comes from. does it matter?

That's good; the multicast address is generated by cman.

> [root@desk ~]# cman_tool nodes
> Node  Sts   Inc   Joined               Name
>    1   M     88   2009-08-14 17:44:00  cool
>    2   M     76   2009-08-14 17:43:52  desk
>
> [root@cool ~]# cman_tool nodes
> Node  Sts   Inc   Joined               Name
>    1   M     84   2009-08-14 09:46:06  cool
>    2   M     88   2009-08-14 09:46:06  desk
>
> [root@desk ~]# group_tool
> groupd not running
> fence domain
> member count  2
> victim count  0
> victim now    0
> master nodeid 2
> wait state    none
> members       1 2
>
> [root@cool ~]# group_tool
> groupd not running
> fence domain
> member count  2
> victim count  0
> victim now    0
> master nodeid 2
> wait state    none
> members       1 2

The cluster is fine.

> #checking for differences, it seems only the group_tool output differs.
> is the problem in groupd? does it start automatically? I didn't start it
> by hand. what I do is "service cman start" on both nodes and then "mount
> ...." on both nodes.

groupd is not needed.

> node desk:
> Aug 14 18:07:44 desk gfs_controld[2206]: recovery_uevent mg not found 1
> Aug 14 18:07:44 desk gfs_controld[2206]: recovery_uevent mg not found 1
> Aug 14 18:07:44 desk gfs_controld[2206]: recovery_uevent mg not found 1

There's a problem here, but it's not clear what has gone wrong. Could you
try this again and, after these messages appear, send the output of
"gfs_control dump" from both nodes?

> Aug 14 10:14:00 cool kernel: INFO: task mount.gfs2:2458 blocked for more
> than 120 seconds.

The second mount is probably stuck because the first one went bad.

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
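
[Editor's sketch of how the diagnostics Dave asks for might be captured on
each node; the commands "gfs_control dump" and "cman_tool nodes" come from
the thread above, while the output file names are only illustrative.]

    # run on both desk and cool after the recovery_uevent messages appear
    gfs_control dump > /tmp/gfs_control_dump.$(hostname).txt
    cman_tool nodes  > /tmp/cman_nodes.$(hostname).txt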