On 12/28/2009 11:30 PM, Marcelo Tosatti wrote:
On Mon, Dec 28, 2009 at 10:47:20PM +0200, Avi Kivity wrote:
On 12/28/2009 10:37 PM, Marcelo Tosatti wrote:
On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
When the guest acknowledges an interrupt, it sends an EOI message to the local
apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
the ioapic mutex.
On large guests, this causes a lot of contention on this mutex. Since large
guests usually don't route interrupts via the ioapic (they use msi instead),
this is completely unnecessary.
Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
the mutex, check if the ioapic was actually responsible for the acked vector.
If not, we can return early.
Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
code), instead?
That's a lot cleaner, yes. Indeed there's the TMR which holds this
info. Gleb suggested doing this in the local apic but we didn't think
of using the TMR.
Problem with storing in the LAPIC is you have to migrate the bitmap
along (otherwise can't know if EOI is from MSI or IOAPIC). But it sounds
much simpler.
If we move the vectors_handled bitmap to the local apic, I don't see how
it simplified things.
There's a small race there - the TMR is set after the IRR, so the
interrupt can be injected and acked before the TMR is updated, but that
can be fixed by switching the order.
Makes sense.
Btw, that race is already exposed to the guest, if it cares to read
TMR. I'll send a patch.
But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()?
Oops.
The worrying thing about the handled_vectors bitmap in the IOAPIC is
that the update is not atomic wrt to lapic EOI handler.
Unless its certain that races there are the guests problem, which should
have proper locking to never allow things like
kvm_set_ioapic vec
update handled bitmap, vec not IOAPIC
handled anymore
ack lapic irq vec
to happen.
(with bitmap in LAPIC you avoid those things).
It seems real hardware will have the same issue (also look at comments
regarding irq migration in arch/x86/kernel/io_apic.c). So I think a
guest is required to ack before migrating an irq.
--
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html