Hi everyone,

I've run into a strange problem on our RH cluster installation. We have a cluster that uses iSCSI shared storage for GFS2. It's been running for months with no problems.

Today, the app on one node died. I logged in, assumed things were fenced, and tried to go about my business of restarting it. After some fiddling, I got the box back in the cluster fine.

It just happened again, and I've dug in a bit more. I was wrong - the failed node has not been fenced. The last thing in dmesg on the failing node is:

GFS2: fsid=: Trying to join cluster "lock_dlm", "sensors:rrd_gfs"
GFS2: fsid=sensors:rrd_gfs.1: Joined cluster. Now mounting FS...
GFS2: fsid=sensors:rrd_gfs.1: jid=1, already locked for use
GFS2: fsid=sensors:rrd_gfs.1: jid=1: Looking at journal...
GFS2: fsid=sensors:rrd_gfs.1: jid=1: Done

Any read or write to the mounted filesystem hangs, as if the DLM can't get locks. Connectivity to the storage is good: no interfaces show dropped packets or errors. cman_tool reports the node as healthy:

[root@sensor01 ~]# cman_tool status
Version: 6.0.1
Config Version: 14
Cluster Name: sensors
Cluster Id: 14059
Cluster Member: Yes
Cluster Generation: 368
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Total votes: 2
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0 11
Node name: sensor01.dc3
Node ID: 1
Multicast addresses: 239.192.54.34

The missing vote is a third node that is not yet live, but it's been in that state for weeks now with no problems.

[root@sensor01 ~]# cman_tool nodes -f
Node  Sts   Inc   Joined               Name
   1   M    360   2008-08-25 16:24:29  sensor01.dc3
       Last fenced:   2008-08-25 16:04:25 by leaf8b-2.dc3
   2   M    364   2008-08-25 16:24:29  sensor02.dc3
   3   X    364                        sensor03.dc3
       Node has not been fenced since it went down

The fencing above is from when I rebooted the node - because processes were hung on GFS I/O, I had to hard reset the box, which caused the other nodes to fence it.

Cluster LVM operations seem to work fine - I can query all LVM objects without a problem.
But as soon as I try a filesystem operation, boom, I hang.

Any hints on where I can start looking?

--
Ross Vandegrift
ross@xxxxxxxxxxx

"The good Christian should beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell."
    --St. Augustine, De Genesi ad Litteram, Book II, xviii, 37
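
A side note on the vote counts in the cman_tool status output above: they are consistent with a simple-majority quorum rule, under which 3 expected votes give a quorum of 2, so the two live nodes remain quorate with the third node absent. The sketch below just reproduces that arithmetic. The function names are hypothetical, and the majority formula (floor(expected/2) + 1) is an assumption about how the quorum figure is derived, not something taken from this cluster's configuration.

```python
def majority_quorum(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes.

    Assumption: quorum is computed as floor(expected_votes / 2) + 1.
    """
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the majority threshold."""
    return total_votes >= majority_quorum(expected_votes)


if __name__ == "__main__":
    # Values copied from the cman_tool status output in the post.
    expected_votes = 3
    total_votes = 2

    print("quorum threshold:", majority_quorum(expected_votes))
    print("quorate:", is_quorate(total_votes, expected_votes))
```

If this assumption holds, losing one more vote would drop the cluster below quorum, which is a different failure from the hang described here: the status output shows the node is still a quorate cluster member.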