Suresh Siddha <suresh.b.siddha@xxxxxxxxx> writes:

> On Tue, 2009-06-30 at 12:36 -0700, Eric W. Biederman wrote:
>> Dropped irqs.. Driver hangs because it is waiting for an irq. Hardware
>> hangs because it is waiting for the cpu to process the irq.
>>
>> Potentially we get a level triggered irq that is never acked by
>> the cpu that won't arm until the cpu sends an ack, and we can't
>> send an ack from another cpu.
>
> Eric,
>
> Among the number of experiments you have tried in the past to fix this, have
> you tried the experiment of explicitly clearing the remoteIRR by
> changing the trigger mode to edge and then back to level.
>
> Is there a problem with this?

The problem I had wasn't remoteIRR getting stuck, but the symptoms
were largely the same. I did try changing the trigger mode to edge
and back and that did not unstick the ioapic in all cases.

> We can send a spurious IPI (after the RTE migration) with the new vector
> to another cpu and handler which services the level interrupt will check
> if we saw the edge mode for a level interrupt and then the handler can
> explicitly restore the level trigger and reset the remote IRR by
> mask+edge and unmask+level.
>
> We might have to work with some rough edges but do you recollect any
> major issue with this approach..

This is coming up often enough recently that I expect it is time to cook
up a patch that does the ioapic migration in process context, plus some
user space code that stress tests things, just so people can repeat my
experiments and see what I am trying to avoid.

Eric
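For readers following the mask+edge / unmask+level sequence discussed above,
here is a minimal sketch in C of what that sequence does to the low word of
an IOAPIC redirection table entry (RTE). It is a software model only, not
kernel code: struct model_rte, model_rte_write() and model_clear_remote_irr()
are hypothetical names introduced for illustration, and the model simply
assumes that programming an entry as edge-triggered clears remote IRR. As
noted above, that did not unstick the ioapic in all cases on real hardware,
so treat the model as a statement of the intended behaviour, not of what
every chipset does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout of the low 32 bits of an IOAPIC redirection table entry. */
#define RTE_VECTOR_MASK   0x000000ffu
#define RTE_REMOTE_IRR    (1u << 14)  /* read-only status bit in hardware */
#define RTE_TRIGGER_LEVEL (1u << 15)  /* 0 = edge, 1 = level */
#define RTE_MASKED        (1u << 16)

/* Hypothetical software model of one RTE; not a kernel structure. */
struct model_rte {
    uint32_t low;
};

/*
 * Model of a register write.
 * ASSUMPTION: writing the entry with edge trigger mode clears remote IRR.
 * Software can never set or clear the remote IRR bit directly.
 */
static void model_rte_write(struct model_rte *rte, uint32_t val)
{
    uint32_t irr;

    assert(rte != NULL);
    irr = rte->low & RTE_REMOTE_IRR;
    if (!(val & RTE_TRIGGER_LEVEL))
        irr = 0;                        /* assumed side effect of edge mode */
    rte->low = (val & ~RTE_REMOTE_IRR) | irr;
}

/* The two-step sequence: mask+edge, then unmask+level. */
static void model_clear_remote_irr(struct model_rte *rte)
{
    /* Step 1: mask the entry and switch it to edge. */
    model_rte_write(rte, (rte->low | RTE_MASKED) & ~RTE_TRIGGER_LEVEL);
    /* Step 2: unmask the entry and restore level trigger. */
    model_rte_write(rte, (rte->low & ~RTE_MASKED) | RTE_TRIGGER_LEVEL);
}
```

In this model, an entry that starts out level-triggered with remote IRR set
ends up level-triggered, unmasked, with the same vector and with remote IRR
clear after model_clear_remote_irr() runs.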