Re: Debugging a hard lockup with no symptoms

On Wed, 14 Apr 2010, Martin Shepherd wrote:

> I have been experiencing hard lockups running a real-time application
> under preempt-rt. Having originally had this problem while running
> under 2.6.29.4-rt16, today I upgraded to 2.6.31.12-rt21, but the
> problem persisted. Under both kernels, the computer simply freezes,
> usually after a few hours of otherwise flawless operation. Nothing
> appears on the serial console or in the system log when the system
> freezes. Unfortunately, turning on the NMI watchdog stops the freezes
> from occurring at all, such that I can't force an Oops that way.
> 
> I have tried running memtest86 on the RAM, without detecting any
> memory errors, and I have verified that the same problem occurs on two
> different (but essentially identical) computers.
> 
> I wonder whether there might be a clue in the fact that turning on the
> NMI watchdog stops the freezes from occurring. Turning on the watchdog
> unfortunately turns off tickless mode, which I need. According to the
> boot-time messages, tickless is turned off because the local APIC is
> non-functional (presumably because the NMI watchdog is using it). What
> kind of bugs would be more likely to be seen when running under
> tickless?

Can you try nmi_watchdog=1 ? That keeps the tickless mode alive.
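
As a rough sketch, the option goes on the kernel boot line. The mode
numbers are the ones documented for x86 kernels of this era (1 = I/O APIC,
2 = local APIC); the leading "..." stands for whatever parameters your boot
loader already passes and is only a placeholder.

```
... nmi_watchdog=1
```

After rebooting, the NMI row in /proc/interrupts should keep increasing if
the watchdog is running, and the boot messages should show whether
tickless mode was kept or switched off again.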

> Could anybody give me any ideas on how to further debug this problem?
> I have been trying to figure this out for weeks, but I haven't found
> any clues.
> 
> In case it is important, the CPU is a 1.8GHz Intel Celeron, on a
> Foxconn motherboard with an Intel G31 chipset, and Intel GMA 3100
> onboard graphics. I am running the kernel (downloaded from kernel.org)
> under Ubuntu 9.10. The computer also hosts two commercial digital I/O
> boards, both generating interrupts, and one commercial analog I/O
> board.

Does the problem reproduce when you disable those boards ? Do you have
the source of the drivers ? Did you ever run with lockdep enabled
(CONFIG_PROVE_LOCKING=y) ?
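
For reference, a minimal sketch of the .config fragment this refers to.
Kconfig should pull in the lockdep options it depends on, so only the one
symbol is listed here; the exact set it selects depends on the tree.

```
CONFIG_PROVE_LOCKING=y
```

Keep in mind that lockdep adds overhead and changes timing, so a clean
run with it enabled is not proof that the locking is correct.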

Thanks,

	tglx