On Tue, 2007-09-11 at 11:06 -0700, Daniel Walker wrote:
> On Tue, 2007-09-11 at 11:39 -0600, David Bahi wrote:
> > On Tue, 2007-09-11 at 02:10 +0000, David Bahi wrote:
> > > trying to get a crashdump in a clean -rt series kernel which hardlocks
> > > on the ltp openposix conformance interfaces sched_yield 1-1.test ...
> > >
> > > i was trying with nmi_watchdog = 1 (x86_64 machine, dual cpu, quad core,
> > > HT enabled) and it would report NMI LOCKUP on presentation of the login
> > > prompt - no chance to run the test.
> > >
> > >                          linux-2.6.22.1-rt9   linux-2.6.23-rc4-rt1
> > > lockup on openposix
> > > sched_yield test occurs         yes                  yes
> > >
> > > crash on login prompt
> > > with nmi_watchdog=1             yes*                 yes*
> > >
> > > (*) console logs attached
> > >
> >
> > fyi 2.6.23-rc4 w/o -rt passes the LTP openposix sched_yield test and
> > does *not* crash (NMI LOCKUP) if nmi_watchdog=1 is a boot arg.
> >
> > so these don't seem to be related to what i'm seeing at least:
> >
> > http://thread.gmane.org/gmane.linux.kernel/577449
>
> The patches above are related to nmi_watchdog=2 . Is the same lock up
> detected with nmi_watchdog=2 ?

sorry to pull your work in to this thread wrongly Daniel. my point was
really that the 23-rc4 kernel does not experience either failure for me
and that the current work being done for nmi_watchdog is unrelated, so
this still needs attention.

no hang with nmi_watchdog=2 in 23-rc4 (non -rt)
no hang with nmi_watchdog=2 in 23-rc4-rt1

test box has dual quad Xeons - not a core duo - so the coreduo_ed_ops
workaround isn't needed, right? and the endflag=1 isn't needed since
x86_64 sets this in both code paths (inefficiently).

        if (!atomic_read(&nmi_active)) {
                kfree(counts);
                atomic_set(&nmi_active, -1);
                endflag = 1;
                return -1;
        }
        endflag = 1;
        printk("OK.\n");

finally, i didn't try nmi_watchdog=2 earlier because the
Documentation/nmi_watchdog.txt file says it's not a useful setting for
x86_64.

quote:
    For x86-64, the needed APIC is always compiled in, and the NMI
    watchdog is always enabled with I/O-APIC mode (nmi_watchdog=1).
    Currently, local APIC mode (nmi_watchdog=2) does not work on x86-64.

Is this no longer true? My experience with nmi_watchdog=2 and this LTP
openposix sched_yield 1-1.test is that this test hardlocks the host and
that no watchdog is triggered with this setting.

db