Re: NMI watchdog + NOHZ question

David Miller <davem@xxxxxxxxxxxxx> · Tue, 23 Jun 2009 17:17:35 -0700 (PDT)

From: David Miller <davem@xxxxxxxxxxxxx>
Date: Mon, 22 Jun 2009 02:27:52 -0700 (PDT)

> I think something else is afoot, because while using "nohz=off" makes
> the problem go away, simply adding a NMI watchdog touch after the
> schedule() call in cpu_idle() does not make the problem go away.
> 
> Also, the cpu that gets the NMI watchdog is different from the cpu
> running the qla2xxx driver init.   That basically destroys the bulk
> of my theory :-)

Ok, I think I know what's happening now.

CPU 0 is in the driver init and looping submitting mailbox
commands to load the firmware, then waiting for completion.

CPU 1 is receiving the device interrupts.  CPU 1 is where the
NMI watchdog triggers.

CPU 0 is submitting mailbox commands fast enough that by the
time CPU 1 returns from the device interrupt handler, a new
one is pending.  This sequence runs for more than 5 seconds.

The problematic case is CPU 1's timer interrupt running when
the barrage of device interrupts begins.  Then we have:

	timer interrupt
	return for softirq checking
	pending, thus enable interrupts

		 qla2xxx interrupt
		 return
		 qla2xxx interrupt
		 return
		 ... 5+ seconds pass
		 final qla2xxx interrupt for fw load
		 return

	run timer softirq
	return

At some point in the multi-second qla2xxx interrupt storm we trigger
the NMI watchdog on CPU 1 from the NMI interrupt handler.

The timer softirq, once we get back to running it, is smart enough to
run the timer work enough times to make up for the missed timer
interrupts.
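
As a rough illustration of that catch-up behaviour, here is a toy model
in C.  It is a sketch only, not the kernel's timer code: the function
name run_timer_softirq_model() and its arguments are made up for this
example, and counter wrap-around is ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a timer softirq catching up after a stall.
 * Hypothetical names; the real kernel code differs.
 *
 * timer_jiffies: next tick whose timer work has not been run yet
 * jiffies_now:   current global tick count
 *
 * Returns how many ticks' worth of timer work were processed.
 */
static unsigned long run_timer_softirq_model(unsigned long *timer_jiffies,
                                             unsigned long jiffies_now)
{
    unsigned long ran = 0;

    assert(timer_jiffies != NULL);

    /* Process every tick that went by, not just the latest one. */
    while (*timer_jiffies <= jiffies_now) {
        /* ... expire the timers due at *timer_jiffies here ... */
        (*timer_jiffies)++;
        ran++;
    }
    return ran;
}
```

The point of the model is that the timer work itself is not lost: one
late softirq run covers all of the ticks that went by.  Nothing in it
changes how many timer interrupts were counted.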

However, the NMI watchdogs (both x86 and sparc) use the timer
interrupt count to notice the cpu is wedged.  But in the above
scenario we'll receive only one such timer interrupt even if we
last all the way back to running the timer softirq.
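
To make the failure mode concrete, here is a small C model of a
watchdog that keys off the timer interrupt count.  It is a sketch under
assumptions, not the x86 or sparc implementation: struct wd_state, the
wd_*() functions and the "limit" parameter (consecutive NMIs with no
progress before firing) are all hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cpu watchdog state; not the kernel's layout. */
struct wd_state {
    unsigned long timer_irq_count;  /* bumped by the timer interrupt */
    unsigned long last_seen_count;  /* count seen at the previous NMI */
    unsigned int  stalled_nmis;     /* consecutive NMIs with no change */
};

/* Modeled timer interrupt handler: only this advances the count. */
static void wd_timer_irq(struct wd_state *s)
{
    s->timer_irq_count++;
}

/* Modeled NMI handler: returns true when the watchdog would fire. */
static bool wd_nmi_check(struct wd_state *s, unsigned int limit)
{
    assert(limit > 0);

    if (s->timer_irq_count != s->last_seen_count) {
        /* The count moved, so the cpu looks alive. */
        s->last_seen_count = s->timer_irq_count;
        s->stalled_nmis = 0;
        return false;
    }
    /* No new timer interrupt since the last NMI. */
    return ++s->stalled_nmis >= limit;
}

/*
 * Replay the scenario above: one timer interrupt, then a run of NMIs
 * while device interrupts keep the timer softirq from running.
 * Returns the index of the NMI that fires the watchdog, or -1.
 */
static int wd_replay_storm(unsigned int nmis, unsigned int limit)
{
    struct wd_state s = { 0, 0, 0 };
    unsigned int i;

    wd_timer_irq(&s);   /* the single timer interrupt before the storm */
    for (i = 0; i < nmis; i++) {
        if (wd_nmi_check(&s, limit))
            return (int)i;
    }
    return -1;
}
```

In this model a long enough storm always trips the check, because
device interrupts never touch timer_irq_count; a short storm returns
-1.  The cpu is making progress the whole time, but the watchdog has
no way to see that.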

I'm not exactly sure what to do about this.