* David Daney (ddaney@xxxxxxxxxxxxxxxxxx) [100329 18:54]:
> On 03/27/2010 04:07 PM, Andreas Barth wrote:
>> * David Daney (ddaney@xxxxxxxxxxxxxxxxxx) [100326 19:57]:
>>> Also you could try running with the attached patch. It is not the best
>>> watchdog, but it will print the register state for each core when things
>>> get stuck. Occasionally that is enough to see where the problem is.
>>
>> Thanks.
>>
>> As our logging has only limited buffer size, I'd be happy about a
>> variant of the patch which doesn't reboot but just lets the machine
>> hang after the third occurrence.
>>
>> Any chances for it?

> You could just sit in a loop kicking the watchdog timer after you get to
> the NMI handler. That should prevent a reset, but still print the
> machine state.

I have to admit that I'm totally unable to turn that statement into code.
Could you (or someone else) give me a hand?

Also please note that it usually takes a few hours to crash the machine,
and I didn't see anything in the normal syslog.


Andi
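
For reference, the suggestion quoted above (dump the state once in the NMI
handler, then keep kicking the watchdog so the hardware never gets to the
reset stage) could look roughly like the sketch below. It is a user-space
illustration only, not the attached patch: struct wdt_hooks,
nmi_hold_machine() and the sim_* helpers are hypothetical names, and the
real handler would call whatever register-dump and watchdog-poke routines
the patch already has. The max_pokes bound exists only so the sketch can be
run outside the kernel; on the real machine the loop would presumably never
exit.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical hooks standing in for the real patch's routines. */
struct wdt_hooks {
    void (*dump_state)(void *ctx);  /* print the register state once */
    void (*poke)(void *ctx);        /* kick (pet) the hardware watchdog */
    void *ctx;                      /* opaque data passed to both hooks */
};

/*
 * Body of a hypothetical NMI handler: print the machine state once, then
 * sit in a loop kicking the watchdog so that no reset is triggered.
 *
 * max_pokes == 0 means "spin forever" (the behaviour wanted on real
 * hardware). A nonzero value stops after that many pokes, which is only
 * useful for exercising the sketch in user space.
 *
 * Returns the number of pokes issued (only reachable when bounded).
 */
unsigned long nmi_hold_machine(const struct wdt_hooks *h,
                               unsigned long max_pokes)
{
    unsigned long pokes = 0;

    assert(h != NULL && h->dump_state != NULL && h->poke != NULL);

    h->dump_state(h->ctx);          /* state is printed exactly once */

    for (;;) {
        h->poke(h->ctx);            /* keeps the watchdog from resetting */
        pokes++;
        if (max_pokes != 0 && pokes >= max_pokes)
            break;                  /* test-only exit path */
    }
    return pokes;
}

/* Simulation helpers: count calls instead of touching hardware. */
struct sim_counters {
    unsigned long dumps;
    unsigned long pokes;
};

static void sim_dump(void *ctx)
{
    struct sim_counters *c = ctx;
    c->dumps++;
    puts("(register dump would be printed here)");
}

static void sim_poke(void *ctx)
{
    struct sim_counters *c = ctx;
    c->pokes++;
}
```

To get the "hang only after the third occurrence" behaviour, one option
would be to keep a counter of watchdog events and call something like
nmi_hold_machine() only once it reaches three. Whether that fits depends on
how the patch tracks its stages, which is not shown in this thread.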