Re: nmi_watchdog in latest 22 and 23-rc -rt series

On Tue, 2007-09-11 at 11:06 -0700, Daniel Walker wrote:
> On Tue, 2007-09-11 at 11:39 -0600, David Bahi wrote:
> > On Tue, 2007-09-11 at 02:10 +0000, David Bahi wrote:
> > > trying to get a crashdump in a clean -rt series kernel which hardlocks
> > > on the ltp openposix conformance interfaces sched_yield 1-1.test ...
> > > 
> > > i was trying with nmi_watchdog = 1 (x86_64 machine, dual cpu, quad core,
> > > HT enabled) and it would report NMI LOCKUP on presentation of the login
> > > prompt - no chance to run the test.
> > > 
> > >                            linux-2.6.22.1-rt9   linux-2.6.23-rc4-rt1
> > > lockup on openposix 
> > > sched_yield test occurs            yes                  yes
> > > 
> > > crash on login prompt
> > > with nmi_watchdog=1                yes*                 yes*
> > > 
> > >   (*) console logs attached
> > > 
> > > 
> > 
> > fyi 2.6.23-rc4 w/o -rt passes the LTP openposix sched_yield test and
> > does *not* crash (NMI LOCKUP) if nmi_watchdog=1 is a boot arg.
> > 
> > so these don't seem to be related to what i'm seeing at least:
> > 
> >   http://thread.gmane.org/gmane.linux.kernel/577449
> 
> The patches above are related to nmi_watchdog=2 . Is the same lock up
> detected with nmi_watchdog=2 ?

sorry for wrongly pulling your work into this thread, Daniel.  my point
was really that the non-rt 23-rc4 kernel does not show either failure for
me, and that the current nmi_watchdog work is unrelated, so this still
needs attention.

no hang with nmi_watchdog=2 in 23-rc4 (non -rt)
no hang with nmi_watchdog=2 in 23-rc4-rt1

test box has dual quad-core Xeons, not a Core Duo, so the coreduo_ed_ops
workaround isn't needed, right?

and the endflag=1 isn't needed, since x86_64 already sets it in both
code paths (inefficiently):

	if (!atomic_read(&nmi_active)) {
		kfree(counts);
		atomic_set(&nmi_active, -1);
		endflag = 1;
		return -1;
	}
	endflag = 1;
	printk("OK.\n");
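
a minimal user-space sketch of what "both code paths" means here, under
stated assumptions: nmi_active_model and endflag_model are hypothetical
plain-int stand-ins for the kernel's nmi_active and endflag, free() and
printf() stand in for kfree() and printk(), and the function names
check_tail_quoted and check_tail_hoisted are made up for illustration.
it only shows that one assignment hoisted above the branch leaves the
same final state as the two assignments in the snippet above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real kernel objects. */
static int nmi_active_model;	/* models atomic_t nmi_active */
static int endflag_model;	/* models the flag the spinning CPUs wait on */

/* Same shape as the quoted snippet: endflag is set on both paths. */
static int check_tail_quoted(int *counts)
{
	if (!nmi_active_model) {
		free(counts);		/* kfree(counts) in the kernel */
		nmi_active_model = -1;	/* atomic_set(&nmi_active, -1) */
		endflag_model = 1;
		return -1;
	}
	endflag_model = 1;
	printf("OK.\n");		/* printk("OK.\n") in the kernel */
	return 0;
}

/* Same final state, with the assignment written once above the branch. */
static int check_tail_hoisted(int *counts)
{
	endflag_model = 1;
	if (!nmi_active_model) {
		free(counts);
		nmi_active_model = -1;
		return -1;
	}
	printf("OK.\n");
	return 0;
}
```

this is only about the duplicated store.  in the real kernel other CPUs
watch endflag, so moving the store ahead of kfree()/atomic_set() changes
the ordering they can observe and may not be safe there; treat the
hoisted form as an illustration, not a proposed patch.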


finally, i didn't try nmi_watchdog=2 earlier because the
Documentation/nmi_watchdog.txt file says it's not a useful setting for
x86_64.  quote:

  For x86-64, the needed APIC is always compiled in, and the NMI
  watchdog is always enabled with I/O-APIC mode (nmi_watchdog=1).   
  Currently, local APIC mode (nmi_watchdog=2) does not work on x86-64.

Is this no longer true?  My experience with nmi_watchdog=2 and this LTP
openposix sched_yield 1-1.test is that the test hardlocks the host and
that no watchdog is triggered with this setting.

db



