On Sat, Feb 23, 2013 at 05:31:44PM +0200, Gleb Natapov wrote:
> On Sat, Feb 23, 2013 at 11:48:54AM -0300, Marcelo Tosatti wrote:
> > > > > 1. orig_irr = read irr from vapic page
> > > > > 2. if (orig_irr != 0)
> > > > > 3.     return false;
> > > > > 4. kvm_make_request(KVM_REQ_EVENT)
> > > > > 5. bool injected = !test_and_set_bit(PIR)
> > > > > 6. if (vcpu->guest_mode && injected)
> > > > > 7.     if (test_and_set_bit(PIR notification bit))
> > > > > 8.         send PIR IPI
> > > > > 9. return injected
> > > >
> > > > Consider follow case:
> > > > vcpu 0                      |                vcpu1
> > > > send intr to vcpu1
> > > > check irr
> > > >                                              receive a posted intr
> > > >                                              pir->irr(pir is cleared, irr is set)
> > > > injected=test_and_set_bit=true
> > > > pir=set
> > > >
> > > > Then both irr and pir have the interrupt pending, they may merge to one later, but injected reported as true. Wrong.
> > > >
> > > I and Marcelo discussed the lockless logic that should be used here on
> > > the previous patch submission. All is left for you is to implement it.
> > > We worked hard to make irq injection path lockless, we will not going to
> > > add one now.
> >
> > He is right, the scheme is still flawed (because of concurrent injectors
> > along with HW in VMX non-root). I'd said lets add a spinlock think about
> > lockless scheme in the meantime.
> The logic proposed was (from that thread):
> apic_accept_interrupt() {
>  if (PIR || IRR)
>    return coalesced;
>  else
>    set PIR;
> }
>
> Which should map to something like:
> if (!test_and_set_bit(PIR))
>     return coalesced;

HW transfers PIR to IRR, here. Say due to PIR IPI sent
due to setting of a different vector.

> if (irr on vapic page is set)
>     return coalesced;
>
> I do not see how the race above can happen this way. Other can though if
> vcpu is outside a guest. My be we should deliver interrupt differently
> depending on whether vcpu is in guest or not.

Problem is with 3 contexts: two injectors and one vcpu in guest mode.
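
To make that interleaving concrete, here is a minimal single-threaded
sketch that replays it step by step. It is only a model under stated
assumptions, not kernel code: struct vapic_model, model_test_and_set(),
model_hw_sync() and replay_reports_coalesced() are hypothetical names,
one 64-bit word stands in for each bitmap, and "old PIR bit already set"
is taken to mean coalesced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit word stands in for each bitmap. */
struct vapic_model {
    uint64_t pir;   /* posted-interrupt requests */
    uint64_t irr;   /* IRR as seen on the vAPIC page */
};

/* Non-atomic stand-in for test_and_set_bit(); returns the old bit value. */
static bool model_test_and_set(uint64_t *word, unsigned vec)
{
    uint64_t mask = UINT64_C(1) << vec;
    bool old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/* Models the CPU handling a notification IPI: merge PIR into IRR, clear PIR. */
static void model_hw_sync(struct vapic_model *v)
{
    v->irr |= v->pir;
    v->pir = 0;
}

/*
 * Replays one interleaving of the three contexts:
 *   injector A posts vec_a, injector B posts vec_b and triggers the sync,
 *   then injector A checks IRR.
 * Returns true if injector A ends up reporting "coalesced".
 */
static bool replay_reports_coalesced(unsigned vec_a, unsigned vec_b)
{
    struct vapic_model v = { 0, 0 };

    /* Injector A: PIR bit was clear, so A continues to the IRR check. */
    if (model_test_and_set(&v.pir, vec_a))
        return true;

    /* Injector B: posts a different vector; its IPI makes HW sync PIR. */
    model_test_and_set(&v.pir, vec_b);
    model_hw_sync(&v);

    /* Injector A: IRR now holds A's own vector, so A reports coalesced. */
    return (v.irr & (UINT64_C(1) << vec_a)) != 0;
}
```

In this replay, replay_reports_coalesced(3, 5) returns true although
vector 3 was posted exactly once, so nothing was actually coalesced.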

Earlier on that thread you mentioned

"The point is that we need to check PIR and IRR atomically and this is
impossible."

That would be one way to fix it.

> I'd rather think about proper way to do lockless injection before
> committing anything. The case where we care about correct injection
> status is rare, but we always care about scalability and since we
> violate the spec by reading vapic page while vcpu is in non-root
> operation anyway the result may be incorrect with or without the lock.
> Our use can was not in HW designers mind when they designed this thing
> obviously :(

Zhang, can you comment on whether reading the vapic page while a CPU in
VMX non-root is accessing it is safe?

Gleb, yes, a comment mentioning the race (instead of the spinlock) and
an explanation of why it's believed to be benign (given how the injection
return value is interpreted) could also work. It's ugly, though... Murphy
is around.

OTOH a spinlock is not the end of the world; we can figure out something
later (we've tried without success so far).