Re: Same OOPS on both cluster nodes, separated by a week.

Eric Kerin <eric@xxxxxxxxxxx> · Mon, 20 Jun 2005 16:17:10 -0400



On Mon, 2005-06-20 at 20:13 +0100, Patrick Caulfield wrote:
> On Mon, Jun 20, 2005 at 12:07:26PM -0400, Eric Kerin wrote:
> > I got the following oops messages on my cluster nodes, each at a different
> > time.  The first was on node A: I was running clustat and did a ctrl-4 to
> > kill it (it was taking a long while to run and seemed to be blocked by
> > something).  The second time after doing that, OOPS#1 showed up.  The
> > second oops showed up on node B: the cluster was running, and I
> > wasn't actually doing anything beyond watching a tcpdump to see
> > some data flow by.  I went away for about 10 minutes, and when I came back
> > node B had blocked up and was fenced by A.  The OOPS was in the
> > messages file.
> > 
> 
> Well, they're both the same oops. It looks like a race between the AST being 
> delivered and the process shutting down. I'm not in a position to look at it
> in more detail ATM - I'll investigate when I get back to base.
> 
> It might be good to have this in bugzilla. IYWBSK

I will be so kind. Bug submitted: #161146

Thanks,
Eric Kerin

--

Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster