From: Andi Kleen <andi@xxxxxxxxxxxxxx>
Date: Wed, 24 Jun 2009 09:03:15 +0200

>> I'm not exactly sure what to do about this.
>
> Ack the timer interrupt earlier (and also give it a high priority?)

It has a higher priority, but all interrupts get re-enabled right
before we process software interrupts. So the flood of qla2xxx
interrupts can come in before we can run the timer softirq and thus
schedule the next timer interrupt.

> That could still be problematic if you have non-nestable irq stacks
> (haven't checked if sparc has that or not),
> potentially you might need to run the softirq on the process stack.

IRQ stacks on sparc64 work identically to how they do on x86.

I have some more theories about this, in that I always see the NMI
watchdog message with a PC right in the section of CPU idle where
NOHZ is enabled.

On these cpus there is no support for yielding, so on them I just
touch the NMI watchdog in the loop waiting for need_resched() to
become true.

But if we get the qla2xxx interrupt storm during that loop, it's
pretty easy to not touch the NMI watchdog in time.
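
The idle-loop starvation described above can be illustrated with a
toy user-space model. This is only a sketch under stated assumptions:
struct sim_cpu, touch_watchdog(), watchdog_expired(), idle_loop() and
the 100-tick limit are hypothetical names and values made up for
illustration. It is not the sparc64 idle code and uses no real kernel
interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one CPU; not a kernel structure. */
struct sim_cpu {
	unsigned long now;        /* simulated time, in ticks */
	unsigned long last_touch; /* time the watchdog was last touched */
	bool need_resched;        /* set when there is work to schedule */
};

/* Assumed watchdog threshold, in ticks (illustrative value only). */
#define WATCHDOG_LIMIT 100UL

static void touch_watchdog(struct sim_cpu *cpu)
{
	cpu->last_touch = cpu->now;
}

static bool watchdog_expired(const struct sim_cpu *cpu)
{
	return cpu->now - cpu->last_touch > WATCHDOG_LIMIT;
}

/*
 * Simulated idle loop: touch the watchdog once per pass while waiting
 * for need_resched. irq_burst is the number of ticks spent handling
 * interrupts between two consecutive passes. Returns true if the
 * watchdog would have fired before the loop could touch it again.
 */
static bool idle_loop(struct sim_cpu *cpu, unsigned long irq_burst,
		      unsigned long max_passes)
{
	unsigned long i;

	assert(cpu != NULL);

	for (i = 0; i < max_passes && !cpu->need_resched; i++) {
		touch_watchdog(cpu);
		cpu->now += 1;         /* one pass of the loop body */
		cpu->now += irq_burst; /* interrupts run before the next pass */
		if (watchdog_expired(cpu))
			return true;
	}
	return false;
}
```

In this model, a short burst (irq_burst = 10) never lets the gap
between touches exceed the limit, while a long burst (irq_burst = 500)
trips the watchdog on the very first pass, even though the loop itself
is doing exactly what it was written to do.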