On Tue, 8 Jan 2008, David Teigland wrote:

> On Fri, Jan 04, 2008 at 04:18:45PM -0500, Charlie Brady wrote:
> > We've reduced the application code to a simple test case. The following
> > code run on each node will soon block, and doesn't receive signals until
> > the peer node is shutdown:
> ...
> Yes, this stresses a problematic design limitation in the RHEL4 dlm where
> the dlm master node is ping-ponging all over the place and becomes so
> unstable that everything comes to a halt. One possible work-around is to
> modify the program to hold a lock on filedes to keep the master stable,
> e.g. hold a zero length lock at some unused offset like 0xFFFFFF.

Thanks. I've passed the advice on.

--
Charlie

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
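
For anyone wanting to try the work-around David suggests, here is a rough sketch of holding a long-lived lock at an unused offset. It is not the original test case (that code is not shown in this thread) and it has not been checked against the RHEL4 dlm. The helper names `hold_anchor_lock` and `release_anchor_lock` are made up for illustration, and using a shared lock, so that every node can hold it at once, is my assumption. Note that with POSIX record locks a length of 0 means "from the start offset to end of file".

```python
import fcntl
import os

# Example offset from the advice in this thread; any offset the application
# never locks for real work should do (assumption).
ANCHOR_OFFSET = 0xFFFFFF


def hold_anchor_lock(fd, offset=ANCHOR_OFFSET):
    """Take a shared POSIX record lock at `offset` and leave it held.

    Hypothetical helper. A length of 0 means "to end of file" in fcntl
    terms. LOCK_SH is used so that the same call can succeed on every
    node; the fd must be open for reading. LOCK_NB makes the call raise
    OSError instead of blocking if another process holds a conflicting lock.
    """
    fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB, 0, offset, os.SEEK_SET)


def release_anchor_lock(fd, offset=ANCHOR_OFFSET):
    """Drop the lock taken by hold_anchor_lock (hypothetical helper)."""
    fcntl.lockf(fd, fcntl.LOCK_UN, 0, offset, os.SEEK_SET)
```

The idea would be to call `hold_anchor_lock(fd)` once, right after opening the shared file, and only release it at exit, so that the file always has at least one lock outstanding while the program takes and drops its other locks.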