Hi Thomas,

On Fri, May 08, 2020 at 06:49:15PM +0200, Thomas Gleixner wrote:
> Ashok,
>
> "Raj, Ashok" <ashok.raj@xxxxxxxxx> writes:
> > With legacy MSI we can have these races and kernel is trying to do the
> > song and dance, but we see this happening even when IR is turned on.
> > Which is perplexing. I think when we have IR, once we do the change vector
> > and flush the interrupt entry cache, if there was an outstanding one in
> > flight it should be in IRR. Possibly should be cleaned up by the
> > send_cleanup_vector() I suppose.
>
> Ouch. With IR this really should never happen and yes the old vector
> will catch one which was raised just before the migration disabled the
> IR entry. During the change nothing can go wrong because the entry is
> disabled and only reenabled after it's flushed which will send a pending
> one to the new vector.

With IR, I'm not sure we actually mask the interrupt except when it's a
Posted Interrupt.

We do an atomic update to the IRTE with cmpxchg_double:

	ret = cmpxchg_double(&irte->low, &irte->high,
			     irte->low, irte->high,
			     irte_modified->low, irte_modified->high);

followed by flushing the interrupt entry cache. After that, any old
interrupts that were in flight before the flush should be sitting in IRR
on the outgoing CPU.

send_cleanup_vector() sends an IPI to apic_id->old_cpu, which would be
the CPU we are running on, correct? And this is a self-IPI to
IRQ_MOVE_CLEANUP_VECTOR.

smp_irq_move_cleanup_interrupt() seems to check IRR with
apicid_prev_vector():

	irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
	if (irr & (1U << (vector % 32))) {
		apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
		continue;
	}

And this would allow any pending IRR bits on the outgoing CPU to call
the relevant ISRs before draining all vectors on the outgoing CPU.

Does that sound right?

I couldn't quite pin down how the device ISRs are hooked up through
send_cleanup_vector() and what follows.
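
To make the indexing in that IRR check concrete, here is a minimal
user-space sketch of the arithmetic only. It assumes the xAPIC register
layout (first IRR word at offset 0x200, one 32-bit word every 0x10
bytes). The helper names irr_reg_offset(), irr_bit_mask() and
vector_pending() are hypothetical and are not kernel functions; the
'irr' argument stands in for whatever apic_read() would return.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed xAPIC offset of the first IRR word (APIC_IRR in the kernel). */
#define APIC_IRR 0x200u

/* Offset of the 32-bit IRR word that holds the bit for 'vector'. */
static inline uint32_t irr_reg_offset(unsigned int vector)
{
	assert(vector < 256);	/* 256 vectors = 8 words of 32 bits */
	return APIC_IRR + (vector / 32) * 0x10;
}

/* Mask selecting the bit for 'vector' inside its 32-bit IRR word. */
static inline uint32_t irr_bit_mask(unsigned int vector)
{
	return UINT32_C(1) << (vector % 32);
}

/*
 * Hypothetical helper: returns nonzero if 'vector' is still pending in
 * 'irr', where 'irr' is the word read from irr_reg_offset(vector).
 */
static inline int vector_pending(uint32_t irr, unsigned int vector)
{
	return (irr & irr_bit_mask(vector)) != 0;
}
```

For vector 0x31, for example, this selects the word at offset 0x210 and
bit 17 within it. If that bit is set, the cleanup handler in the snippet
above re-sends the self-IPI and skips the vector for now rather than
freeing it.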