> node1:
> Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " 2
> 1100e7"
> Local Copy, Master is node 2
> Granted Queue
> Conversion Queue
> Waiting Queue
> 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5
> node2:
> Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " 2
> 1100e7"
> Master Copy
> Granted Queue
> 3eeb0117 PR Remote: 1 5eb00178
> Conversion Queue
> Waiting Queue

The state of the lock on node1 looks bad. I'm studying the code and
struggling to understand how it could possibly arrive in that state.
Some things to notice:

- the lock is converting, so it should be on the Conversion Queue, not
  the Waiting Queue

- lockqueue_state is 0, so either node1 has not sent a remote request to
  node2 at all, or node1 did send something and has already received some
  kind of reply, so it's no longer waiting for one

- the state of the lock on node2 looks normal

Did you check for suspicious syslog messages on both nodes? Did any nodes
on this fs mount, unmount or fail around the time this happened? Has this
happened before?

If you'd like to try to reproduce this with some dlm debugging, I could
send you a patch (although this is such an odd state that I'm not sure yet
where I'd begin to add debugging).

Dave
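The first observation above (a converting lock sitting on the wrong queue) can be checked mechanically across a large dump. The following is only a rough sketch, not part of the dlm tooling: the entry layout is inferred from the two entries quoted above, it assumes that "PR (EX)" means granted mode PR with a pending request for EX, and the names `ENTRY`, `parse_entry` and `check_entry` are made up for illustration.

```python
import re

# Assumed entry layout, inferred only from the dump quoted above:
#   <lkid> <granted mode> [(<requested mode>)] ... [LQ: <state>,<flags>]
ENTRY = re.compile(
    r"^(?P<lkid>[0-9a-f]+)\s+(?P<granted>[A-Z]{2})"
    r"(?:\s+\((?P<requested>[A-Z]{2})\))?"
    r"(?:.*?LQ:\s*(?P<lq_state>\d+),)?"
)

def parse_entry(line):
    """Return the fields of one queue entry as a dict, or None if unparsed."""
    m = ENTRY.match(line.strip())
    return m.groupdict() if m else None

def check_entry(queue, line):
    """Return a list of complaints about one entry found on `queue`.

    `queue` is the heading the entry was listed under:
    "Granted", "Conversion" or "Waiting".
    """
    entry = parse_entry(line)
    if entry is None:
        return ["unparsed entry: " + line]
    problems = []
    # Assumption: a mode in parentheses means a conversion is pending.
    if entry["requested"] is not None and queue != "Conversion":
        problems.append(
            "lock %s is converting %s -> %s but sits on the %s Queue, "
            "not the Conversion Queue"
            % (entry["lkid"], entry["granted"], entry["requested"], queue)
        )
    return problems

if __name__ == "__main__":
    # The node1 entry from the dump above, listed under Waiting Queue.
    for msg in check_entry("Waiting", "5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5"):
        print(msg)
```

Run over every entry of a full dump, a check like this would show whether the node1 lock is a one-off or whether other resources are in the same state.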