Marcelo Tosatti wrote on 2013-02-25: > On Sun, Feb 24, 2013 at 01:55:07PM +0000, Zhang, Yang Z wrote: >>> contexts, but only two use locks. >> See following logic, I think the problem you mentioned will not >> happened with current logic. >> >> get lock >> if test_pir (this will ensure there is no in flight IPI for same interrupt. Since we > are taking the lock now, no IPI will be sent before we release the lock and no > pir->irr is performed by hardware for same interrupt.) >> return coalesced if test_irr return coalesced >> set_pir >> injected=true >> if t_a_s(on) && in guest mode >> send ipi >> unlock >> >> >>>>> I'd rather think about proper way to do lockless injection before >>>>> committing anything. The case where we care about correct injection >>>>> status is rare, but we always care about scalability and since we >>>>> violate the spec by reading vapic page while vcpu is in non-root >>>>> operation anyway the result may be incorrect with or without the lock. >>>>> Our use can was not in HW designers mind when they designed this thing >>>>> obviously :( >>>> >>>> Zhang, can you comment on whether reading vapic page with CPU in >>>> VMX-non root accessing it is safe? >> See 24.11.4 In addition to data in the VMCS region itself, VMX non-root >> operation can be controlled by data structures that are referenced by >> pointers in a VMCS (for example, the I/O bitmaps). While the pointers >> to these data structures are parts of the VMCS, the data structures >> themselves are not. They are not accessible using VMREAD and VMWRITE >> but by ordinary memory writes. Software should ensure that each such >> data structure is modified only when no logical processor with a >> current VMCS that references it is in VMX non-root operation. Doing >> otherwise may lead to unpredictable behavior. >> >> This means the initial design of KVM is wrong. It should not to modify >> vIRR directly. >> >> The good thing is that reading is ok. > > OK. > >>>> Gleb, yes, a comment mentioning the race (instead of the spinlock) and >>>> explanation why its believed to be benign (given how the injection >>>> return value is interpreted) could also work. Its ugly, though... murphy >>>> is around. >>> The race above is not benign. It will report interrupt as coalesced >>> while in reality it is injected. This may cause to many interrupt to be >>> injected. If this happens rare enough ntp may be able to fix time drift >>> resulted from this. >> Please check the above logic. Does it have same problem? If yes, please point > out. > > 1) set_pir must be test_and_set_bit() (so the lock at vcpu entry path can > be dropped). > > 2) if t_a_s(on) && in guest mode > send ipi > should be inverted: > > if guest mode && t_a_s(on) > send ipi > So you avoid setting ON bit if guest is outside of guest mode. Need the ON bit to track whether there are pending bit in pir. Or else, must traverse pir even it is empty when calling sync_pir_to_irr. Best regards, Yang -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html