From: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Date: Wed, 14 Apr 2010 00:05:43 +0200

> [ 126.704213] hrtimer: interrupt took 8360145 ns

Ok, I can trigger this, and then I get watchdog timeouts saying
cpu X was stuck for 61 seconds etc. This is just using the
function tracer.

If I dump all the cpu regs (Frederic, this is obtained by using
'echo "y" >/proc/sysrq-trigger' or BREAK+Y on console, you might
find this useful :-) then I see the stuck cpus are in the function
tracer code path, often in the ring buffer entry allocator.

I can "unstick" the machine if I am able to echo "0" into
tracing_enable from one of my shells.

This is beginning to smell like a problem wherein we re-enter the
tracer from the tracer and for some reason we can't get out of
the cycle. Maybe I forgot to annotate a helper function or file
on sparc64 to elide mcount calls.

Anyways, that is one possibility. But, since we see corruptions
sometimes too, this could also point to some issue with the
sparc64 specific ftrace assembler stubs and instruction patching.

I'll dig further.
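
To make the re-entry hypothesis concrete, here is a minimal user-space
sketch of the pattern, not the kernel's actual code. Every name in it
(in_tracer, trace_hook, unannotated_helper, trace_n_calls) is
hypothetical, and the single global flag only stands in for what would
have to be per-cpu state. It assumes the tracer calls a helper that was
itself compiled with an mcount call, so the helper calls back into the
tracer.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-cpu "already inside the tracer" flag. */
static bool in_tracer;

/* Counters so the effect of the guard can be observed. */
static unsigned long events_recorded;
static unsigned long reentries_dropped;

static void trace_hook(void);

/*
 * Stand-in for a helper used by the tracer that was NOT annotated to
 * elide mcount. Calling trace_hook() here models what the
 * compiler-inserted mcount call would do on entry to the helper.
 */
static void unannotated_helper(void)
{
    trace_hook();
}

/* Stand-in for the function tracer callback reached via mcount. */
static void trace_hook(void)
{
    if (in_tracer) {
        /* Re-entered from our own helper: drop the event and return. */
        reentries_dropped++;
        return;
    }

    in_tracer = true;
    unannotated_helper();   /* e.g. part of allocating a buffer entry */
    events_recorded++;
    in_tracer = false;
}

/* Simulate n traced function entries; returns the events recorded. */
unsigned long trace_n_calls(unsigned long n)
{
    unsigned long i;

    for (i = 0; i < n; i++)
        trace_hook();
    return events_recorded;
}
```

With the guard in place each outer call records one event and drops one
re-entry. If the in_tracer check is removed, trace_hook() and
unannotated_helper() call each other without bound, which is the kind of
cycle suspected above.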