On Tue, Jul 30, 2013 at 03:01:27PM -0400, Vince Weaver wrote:
> Hello
>
> so my perf_fuzzer has been causing problems again.
>
> After running a while all login shells on the system (even unrelated
> local ones) get killed. Nothing is logged when this happens and it
> doesn't appear to be OOM related.
>
> In an attempt to find out what was going on I ran the fuzzer with "nohup"
> which led to the following NMI lockup which looks perf related. The
> system became unusable after this.
>
> The first WARNING is I think a known issue but I'm including it in the
> dump in case it is related. It's the NMI lockup that is the problem.
>
> There was possibly some sort of RCU message printed to the screen also
> that didn't make it to the logs but I wasn't able to write it down in
> time.
>
> This is on a recent ivybridge mac-mini running 3.11-rc3
>
> Jul 30 11:08:28 mac-mini kernel: [ 651.209212] hrtimer: interrupt took 1152 ns
> Jul 30 11:08:50 mac-mini kernel: [ 673.441360] perf samples too long (2557 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> Jul 30 11:08:58 mac-mini kernel: [ 680.886547] perf samples too long (5003 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> Jul 30 11:08:58 mac-mini kernel: [ 681.401917] perf samples too long (10002 > 10000), lowering kernel.perf_event_max_sample_rate to 12500

Interesting, saw a similar thing today while running perf top --stdio -a

[47314.677201] perf samples too long (2505 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[47314.686347] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.148 msecs
[47315.946675] perf samples too long (5009 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[47315.955825] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.154 msecs
[47391.116117] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[47391.122034] Do you have a strange power saving mode enabled?
[47391.127731] Dazed and confused, but trying to continue
[53627.692616] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[53627.698547] Do you have a strange power saving mode enabled?
[53627.704202] Dazed and confused, but trying to continue
[64212.289657] usb 1-1.2: USB disconnect, device number 4

along with strange "forgotten" NMIs firing later. Machine is still
running normally after that though.

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--