Folks,

While debugging the stale mgmt v3 lock issues surfaced by different rebalance test cases (mainly RHSC testing and BVT), I found a few buggy places that are, or might be, causing the problem. Some of them are listed below:

1. During the locking phase, if we fail to get the lock response back from the other nodes (probably because of a timeout), we never release the lock taken on the volume.

2. The locking/brick op code has no error handling: it never injects a failure event, so the state machine is left in an incomplete state.

3. IMO, no more than one thread should be in SM transaction processing mode at a time, but looking at the code I feel this is not 100% safe.

Given the above, I am wondering whether it is worth spending effort on fixing these, or whether the safest option would be to move the rebalance code into the sync-op framework.

Your feedback will be appreciated.

Regards,
Atin
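
For illustration only, here is a minimal sketch of the behaviour that points 1 and 2 ask for, written in Python rather than glusterd's C. Every name in it (Event, Peer, run_transaction, the local_locks set) is hypothetical and does not correspond to real glusterd code. It assumes a lost lock response shows up as a timeout, and that sending an unlock to a peer that never locked is harmless.

```python
import enum


class Event(enum.Enum):
    """Hypothetical state machine events; not glusterd's real event names."""
    ALL_ACKED = "all_acked"
    FAILED = "failed"


class Peer:
    """Toy peer. `behaviour` is 'ack', 'reject', or 'timeout'."""

    def __init__(self, behaviour="ack"):
        self.behaviour = behaviour
        self.locked = set()

    def lock(self, volume):
        if self.behaviour == "reject":
            return False
        self.locked.add(volume)
        if self.behaviour == "timeout":
            # The peer took the lock, but its response never arrived.
            raise TimeoutError("no lock response from peer")
        return True

    def unlock(self, volume):
        # Idempotent: unlocking a volume that is not locked is a no-op.
        self.locked.discard(volume)


def run_transaction(local_locks, volume, peers, op, events):
    """Lock `volume` everywhere, run `op`, and always undo the locks.

    Any rejection, timeout, or error from `op` appends Event.FAILED, so the
    caller's state machine is never left waiting for an event.
    """
    if volume in local_locks:
        events.append(Event.FAILED)  # held by another transaction
        return False
    local_locks.add(volume)
    contacted = []
    try:
        for peer in peers:
            contacted.append(peer)
            try:
                ok = peer.lock(volume)
            except TimeoutError:
                ok = False  # a lost response is treated as a failure
            if not ok:
                events.append(Event.FAILED)
                return False
        events.append(Event.ALL_ACKED)
        try:
            op()
        except Exception:
            events.append(Event.FAILED)
            return False
        return True
    finally:
        # Runs on every exit path above, so no stale lock survives a failure.
        for peer in contacted:
            peer.unlock(volume)
        local_locks.discard(volume)
```

The unlock goes to every contacted peer, not only to those that acknowledged, because a peer whose response timed out may still be holding the lock. Point 3 (only one thread in transaction processing at a time) is not addressed by this sketch.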