On Mon, Jun 20, 2005 at 12:07:26PM -0400, Eric Kerin wrote:
> I got the following oops messages on my cluster nodes, both at different
> times. Once was on node A, I was running a clustat, and did a ctrl-4 to
> kill it, (it was taking a long while to run, seemed to be blocked by
> something). The second time after doing that OOPS#1 showed up. The
> second oops showed up on the b node, the cluster was running, and I
> wasn't actually doing anything outside of watching a tcpdump to watch
> some data flow by, went away for about 10 minutes, and when I came back
> node B had blocked up, and was fenced by A. The OOPS was in the
> messages file.
>

Well, they're both the same oops. It looks like a race between the AST
being delivered and the process shutting down.

I'm not in a position to look at it in more detail ATM - I'll investigate
when I get back to base. It might be good to have this in bugzilla.

IYWBSK

--
patrick

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster