On Tue, 2007-09-11 at 12:36 -0700, Daniel Walker wrote:
> On Tue, 2007-09-11 at 13:16 -0600, David Bahi wrote:

...snip...

>
> I guess I'm confused what's happening .. It sounds like with
> nmi_watchdog=1 , the system hangs and the watchdog catches it. With
> nmi_watchdog=2 the system doesn't hang, and the watchdog doesn't catch
> anything (assuming it's working)?

the LTP sched_yield/1-1.test lockup only happens with the -rt patch
applied.

trying to use nmi_watchdog=1 with -rt and it reports LOCKUP at *boot*.
no chance to log in... no chance to run LTP test. the console logs
attached at the beginning of this thread show this for latest -rt on
both 22 and 23-rc

trying to use nmi_watchdog=2 with -rt - does not LOCKUP at boot. but it
also does not appear to be working. log in and run LTP test and system
hardlocks again and the watchdog never detects LOCKUP -> panic ->
crashdump (which is all set up with the hope to discover why the LTP
test is hanging :)

...snip...

> > For x86-64, the needed APIC is always compiled in, and the NMI
> > watchdog is always enabled with I/O-APIC mode (nmi_watchdog=1).
> > Currently, local APIC mode (nmi_watchdog=2) does not work on x86-64.
> >
> > Is this no longer true? My experience with nmi_watchdog=2 and this LTP
> > openposix sched_yield 1-1.test is that this test hardlocks the host and
> > that no watchdog is triggered with this setting.
>
> It looks like it should work .. I'd be surprised if it didn't work.. You
> can check if it's ticking in /proc/interrupts under NMI (it stops when
> the system is idle tho)

well... i've got counts - but like i said earlier it doesn't catch the
LOCKUP when the LTP test hardlocks the host.

luge:~ # cat /proc/interrupts |grep -Ee 'NMI|CPU'
           CPU0       CPU1       CPU2       CPU3   \
NMI:       1559        692        608        544   \
           CPU4       CPU5       CPU6       CPU7
            783        273        213        155

ah, but fortunately a serial console BREAK followed by 'c' dumps....

db
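
A rough sketch of the "is it ticking" check suggested above, for anyone who wants to script it rather than eyeball two runs of cat. It parses the NMI row of /proc/interrupts and compares two samples. The function names (nmi_counts, not_ticking, watch) and the 5-second interval are made up for illustration, and it assumes the usual layout of a CPU header line followed by an "NMI:" row. Since the count stops advancing on an idle CPU, a flat count is only a hint, not proof that the watchdog is dead.

```python
import time


def nmi_counts(text):
    """Parse /proc/interrupts-style text into {cpu_name: nmi_count}."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for tok in fields[1:]:
                if not tok.isdigit():  # stop at a trailing description, if any
                    break
                counts.append(int(tok))
            return dict(zip(cpus, counts))
    return {}  # no NMI row found


def not_ticking(before, after):
    """Return the CPUs whose NMI count did not advance between samples."""
    return [cpu for cpu in before if after.get(cpu, 0) <= before[cpu]]


def watch(path="/proc/interrupts", interval=5.0):
    """Sample the NMI row twice and report CPUs that did not advance."""
    with open(path) as f:
        first = nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = nmi_counts(f.read())
    return not_ticking(first, second)
```

Calling watch() while the box is under load should return an empty list if every CPU is still taking NMIs. It obviously cannot run once the host has hardlocked, so it only helps to confirm the watchdog is armed before starting the LTP test.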