On Thu, 15 Apr 2010, Thomas Gleixner wrote:
Can you try nmi_watchdog=1 ? That keeps the tickless mode alive.
It was nmi_watchdog=1 that turned off tickless mode. Perhaps you meant nmi_watchdog=2? I haven't tried that.
Does the problem reproduce when you disable those boards ?
The real-time code that apparently causes the lockups is controlling hardware continuously via those commercial boards. So unfortunately, if I disable them, then I can't run the code that is causing the problem. In particular, they provide the hardware interrupts that drive the code, and servo feedback that determines what the code does next.
Do you have the source of the drivers ?
Yes, I wrote my own drivers for these boards, so this ought to be easy to solve if I knew what to look for in my code. Yesterday I found one thing that might be a problem, and I hope to get a chance to test it today. In one of my two interrupt threads, I was calling wake_up_interruptible() before writing the PCI registers that clear the interrupt on the board. When I wrote this, I assumed that the interrupt handler would always finish before the scheduler came back into play, but I wonder whether that still holds with threaded interrupts. Note that the user-land thread woken by this call runs at the same real-time priority as the interrupt thread (another mistake?).
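To make the ordering concrete, here is a small userspace model of the reordered handler, with the interrupt cleared on the board before the waiter is woken. It is only a sketch and not the actual driver: every name in it (fake_board, clear_board_irq, wake_waiter, board_irq_thread) is made up for illustration, the "register" is a plain variable, and the event log exists only so the order can be checked. In a real driver the clear would be a register write such as iowrite32() and the wake would be wake_up_interruptible(). Whether this ordering has anything to do with the lockups is still untested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only. All names below are hypothetical. */
enum event { EV_CLEAR_IRQ, EV_WAKE_WAITER };

struct fake_board {
    uint32_t irq_status;   /* stands in for the board's interrupt status register */
    bool data_ready;       /* condition the user-land thread sleeps on */
    enum event log[4];     /* order in which the handler did things */
    int nlog;
};

static void record(struct fake_board *b, enum event ev)
{
    assert(b->nlog < 4);
    b->log[b->nlog++] = ev;
}

/* In a driver this would be a write to the PCI register that clears the interrupt. */
static void clear_board_irq(struct fake_board *b)
{
    b->irq_status = 0;
    record(b, EV_CLEAR_IRQ);
}

/* In a driver this would be wake_up_interruptible() on a wait queue. */
static void wake_waiter(struct fake_board *b)
{
    b->data_ready = true;
    record(b, EV_WAKE_WAITER);
}

/* Models the interrupt thread: returns true if the interrupt was ours. */
static bool board_irq_thread(struct fake_board *b)
{
    if (b->irq_status == 0)
        return false;      /* nothing pending on this board */

    clear_board_irq(b);    /* quiesce the hardware first ... */
    wake_waiter(b);        /* ... and only then wake the waiting thread */
    return true;
}
```

Calling board_irq_thread() on a board with a pending interrupt leaves EV_CLEAR_IRQ in log[0] and EV_WAKE_WAITER in log[1]. The original code had those two steps the other way around.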
Did you ever run with lockdep enabled (CONFIG_PROVE_LOCKING=y) ?
No. Sorry, I hadn't noticed that option. I will turn it on before running the code again.

Thank you for your help,
Martin