On Tue, 2005-01-25 at 19:09 -0700, Kim Lux wrote:
> How do I figure out what is causing the problem? I've checked the
> system logs, but they are clean.

With lots of crashes lately but never an oops or panic message to report, I was about to ask the same question, but just to be safe I left memtest86 running today, and found bad RAM :(

I'm running with mem=236M for now to block out the bad parts, but has there been an RFE for the badram kernel patch? (I'm not seeing any on bugzilla, not even closed GOAWAY or BADIDEA or whatever.) We've already got a version of memtest86 that can spit out the badram values... Assuming the labor of maintaining it in the patchset isn't too high, I think it's probably better to recognize that people are going to use imperfect hardware and give them a way to deal with it than to decide that everyone needs new hardware. (start flamewar now)

http://rick.vanrein.org/linux/badram/

If that turns out not to be the (only) problem, what *is* the best way to get debug info from bad crashes, where even alt-sysrq-jitsu does no good? I know about the serial console capability; lately I've also seen stuff about diskdump and netdump... which of these is most likely to survive serious kernel problems long enough to get a useful report that can be bugzilla'ed?
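
For anyone wanting to try the same mem= workaround: the idea is just to cap usable RAM below the lowest address that memtest86 flags. Here is a rough sketch of that arithmetic, assuming you have the failing physical addresses as integers. The function name mem_limit and the example addresses are made up for illustration and are not output from any real test run.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def mem_limit(bad_addresses):
    """Return a mem= boot parameter that caps RAM below the lowest bad address."""
    if not bad_addresses:
        raise ValueError("no failing addresses given")
    lowest = min(bad_addresses)
    # Round down to a whole MiB so the cap sits at or below the first bad byte.
    return f"mem={lowest // MIB}M"


# Hypothetical failing addresses, chosen only to illustrate the calculation.
print(mem_limit([0x0EC4A2F0, 0x0F000010]))
```

Note that this throws away all RAM above the cap, good or bad, which is exactly the waste that a per-address approach like the badram patch is meant to avoid.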