Gleb Natapov wrote: > On Thu, May 14, 2009 at 09:43:33PM +0800, Dong, Eddie wrote: >> Avi Kivity wrote: >>> Dong, Eddie wrote: >>>> OK. >>>> Also back to Gleb's question, the reason I want to do that is to >>>> simplify event generation mechanism in current KVM. >>>> >>>> Today KVM use additional layer of exception/nmi/interrupt such as >>>> vcpu.arch.exception.pending, vcpu->arch.interrupt.pending & >>>> vcpu->arch.nmi_injected. All those additional layer is due to >>>> compete of VM_ENTRY_INTR_INFO_FIELD >>>> write to inject the event. Both SVM & VMX has only one resource to >>>> inject the virtual event but KVM generates 3 catagory of events in >>>> parallel which further requires additional >>>> logic to dictate among them. >>> >>> I thought of using a queue to hold all pending events (in a common >>> format), sort it by priority, and inject the head. >> >> The SDM Table 5-4 requires to merge 2 events together, i.e. convert >> to #DF/ >> Triple fault or inject serially when 2 events happens no matter NMI, >> IRQ or exception. >> >> As if considering above events merging activity, that is a single >> element queue. > I don't know how you got to this conclusion from you previous > statement. > See explanation to table 5-2 for instate where it is stated that > interrupt should be held pending if there is exception with higher > priority. Should be held pending where? In the queue, like we do. Note > that low prio exceptions are just dropped since they will be > regenerated. I have different understanding here. My understanding is that "held" means NO INTA in HW, i.e. LAPIC still hold this IRQ. > >> We could have either: 1) A pure SW "queue" that will be flush to HW >> register later (VM_ENTRY_INTR_INFO_FIELD), 2) Direct use HW register. >> > We have three event sources 1) exceptions 2) IRQ 3) NMI. We should > have > queue of three elements sorted by priority. On each entry we should Table 5-4 alreadys says NMI/IRQ is BENIGN. > inject an event with highest priority. And remove it from queue on > exit. The problem is that we have to decide to inject only one of above 3, and discard the rest. Whether priority them or merge (to one event as Table 5-4) is another story. > >> >> A potential benefit is that it can avoid duplicated code and >> potential bugs >> in current code as following patch shows if I understand correctly: >> >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -2599,7 +2599,7 @@ static int handle_exception(struct kvm_vcpu >> *vcpu, struct kvm_run *kvm_run) cr2 = >> vmcs_readl(EXIT_QUALIFICATION); >> KVMTRACE_3D(PAGE_FAULT, vcpu, >> error_code, (u32)cr2, (u32)((u64)cr2 >> 32), handler); - >> if (vcpu->arch.interrupt.pending || vcpu->arch.exception.pending ) + >> if (vcpu->arch.interrupt.pending || >> vcpu->arch.exception.pending || >> vcpu->arch.nmi_injected) kvm_mmu_unprotect_page_virt(vcpu, >> cr2); return kvm_mmu_page_fault(vcpu, cr2, error_code); } > This fix is already in Avi's tree (not yet pushed). > >> Either way are OK and up to you. BTW Xen uses HW register directly >> to representing >> an pending event. >> > In this particular case I don't mind to use HW register either, but I > don't see any advantage. > >>> >>>> One example is that exception has higher priority >>>> than NMI/IRQ injection in current code which is not true in >>>> reality. >>>> >>> >>> I don't think it matters in practice, since the guest will see it >>> as a timing issue. NMIs and IRQs are asynchronous (even those >>> generated by the guest through the local APIC). >> >> Yes. But also cause IRQ injection be delayed which may have side >> effect. >> For example if guest exception handler is very longer or if guest >> VCPU fall into recursive #GP. Within current logic, a guest IRQ >> event from KDB (IPI) running >> on VCPU0, as an example, can't force the dead loop VCPU1 into KDB >> since it >> is recursively #GP. > If one #GP causes another #GP this is a #DF. If CPU has a chance to Means another #GP in next instruction i.e. Beginning of #GP handler in guest. No #DF here. > executes > something in between KVM will have a chance to inject NMI. Could have no chance in some cases though not very common. > >> >>> >>>> Another issue is that an failed event from previous injection say >>>> IRQ or NMI may be discarded if an virtual exception happens in the >>>> EXIT handling now. With the patch of generic double fault handling, >>>> this case should be handled as normally. >>>> >>> >>> Discarding an exception is usually okay as it will be regenerated. >>> I don't think we discard interrupts or NMIs. >> In reality (Running OS in guest), it doesn't happen so far. But >> architecturally, >> it could. For example KVM injects an IRQ, but VM Resume get #PF and >> back to KVM with IDT_VECTORING valid. Then KVM will put back the >> failed >> IRQ to interrupt queue. But if #PF handling generates another >> exception, >> then the interrupt queue won't be able to be injected, since KVM >> inject >> exception first. And the interrupt queue is discarded at next VM >> Exit. >> > I acknowledge the presence of the bug although I was not able to > write a test case > to cause it yet, but it is easy to fix this without changing code too > much. Unified event queue and clearing of only injected event on exit > should do the trick. Yes, minor. Eddie -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html