On Fri, 2005-07-08 at 08:27 +0200, Gunther Schlegel wrote:

> I have been running 1.2.22.

Yup, that fixed the status problem, but...

> > Also, the most recent errata fixed a signal handling problem which
> > broke JVMs from running under it.
> There have not been any log messages.
>
> > I'd try the latest release from RHN (clumanager-1.2.26.1-1).

... it is very important to note that JVMs weren't the only thing that
broke because of the signal bug. The signal bug was not fixed until
1.2.26.1 (latest errata).

Some processes use signals to communicate and to avoid deadlocks or
blocking, but if the signals are blocked, they don't help much with
those problems...

As an example - a process which calls alarm(5) to set a timer to wake
itself up right before it calls, say, a blocking select(). 5 seconds
later, SIGALRM comes in - but because it is blocked, the process gets
stuck in select() forever.

> > If that doesn't work, I'd call Red Hat Support...
>
> While calling support is always an option, I am pretty much sure that it
> will not lead to a solution. In the end they will not be able to
> reproduce it and I can't test on a customer's production system.

I suspect that the first thing they would have you do is try the latest
errata from RHN (which fixes the signal problem):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=153070
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161060
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143867
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=149059

(Yeah, it's that bad.)

... which is why I recommended trying it *before* calling support.

> Do not point me to test systems -- they are there, but they do not have
> the problem. Seems to be related to the workload of the machine, which
> is hard to simulate.

> Hmm, I will probably not start up the cluster again... :(

(snipped from earlier)

Use your own judgment, and make the choices that are right for you and
your customer, whatever they are.

I am sorry I could not be more helpful. Good luck.

-- Lon

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster