IP27: Random hard locks after ~16hrs uptime

I've had my Onyx2 running quite a bit lately doing compile runs, and it seems
that after about 16 hours of uptime, the machine will sometimes lock up
completely.  Nothing is printed anywhere, and the serial console becomes
completely unresponsive.  I have to issue a 'rst' from the MSC to bring it back
up again.

It's currently got dual IP31 R14000 node boards (500MHz), and for the most
part, it runs great (I'll regret the electric bill later...).  This is clearly
a bug, but I'm not sure where to start debugging it on this platform, since I
can't trigger it manually.  I even tried sending an NMI, since this machine has
an NMI handler in the kernel, but all that does is reset the machine.

I already ran an extensive memory test from the PROM and had no issues with
that.  I haven't tried running any of the more thorough hardware tests from
IRIX, though.

Ideas?

-- 
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic




