On Sun, Feb 17, 2013 at 06:39:51PM +0100, Jan Kiszka wrote: > On 2013-02-17 18:35, Gleb Natapov wrote: > > On Sun, Feb 17, 2013 at 06:01:05PM +0100, Jan Kiszka wrote: > >> On 2013-02-17 17:26, Gleb Natapov wrote: > >>> On Sun, Feb 17, 2013 at 04:31:26PM +0100, Jan Kiszka wrote: > >>>> On 2013-02-17 16:07, Gleb Natapov wrote: > >>>>> On Sat, Feb 16, 2013 at 06:10:14PM +0100, Jan Kiszka wrote: > >>>>>> From: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> > >>>>>> > >>>>>> If L1 does not set PIN_BASED_EXT_INTR_MASK, we incorrectly skipped > >>>>>> vmx_complete_interrupts on L2 exits. This is required because, with > >>>>>> direct interrupt injection from L0 to L2, L0 has to update its pending > >>>>>> events. > >>>>>> > >>>>>> Also, we need to allow vmx_cancel_injection when entering L2 in we left > >>>>>> to L0. This condition is indirectly derived from the absence of valid > >>>>>> vectoring info in vmcs12. We no explicitly clear it if we find out that > >>>>>> the L2 exit is not targeting L1 but L0. > >>>>>> > >>>>> We really need to overhaul how interrupt injection is emulated in nested > >>>>> VMX. Why not put pending events into event queue instead of > >>>>> get_vmcs12(vcpu)->idt_vectoring_info_field and inject them in usual way. > >>>> > >>>> I was thinking about the same step but felt unsure so far if > >>>> vmx_complete_interrupts & Co. do not include any assumptions about the > >>>> vmcs configuration that won't match what L1 does. So I went for a > >>>> different path first, specifically to avoid impact on these hairy bits > >>>> for non-nested mode. > >>>> > >>> Assumption made by those functions should be still correct since guest > >>> VMCS configuration is not applied directly to real HW, but we should be > >>> careful of course. For instance interrupt queues should be cleared > >>> during nested vmexit and event transfered back to idt_vectoring_info_field. > >>> IIRC this is how nested SVM works BTW. > >> > >> Checking __vmx_complete_interrupts, the first issue I find is that type > >> 5 (privileged software exception) is not decoded, thus will be lost if > >> L2 leaves this way. That's a reason why it might be better to re-inject > >> the content of vmcs12 if it is valid. VMX is a bit more hairy than SVM, > >> I guess. > >> > > I do not see type 5 in SDM Table 24-15. We handle every type specified > > there. Why shouldn't we? SVM and VMX are pretty close in regards to > > event injection, this allowed us to move a lot of logic into the common > > code. > > It's a type relevant for event delivery, see 24-16. > > I think we only handle what we can possibly generate. This assumption > would have to be checked and potentially resolved before we can use the > standard code for nesting as well. > I cannot find what can generate "privileged software exception" exit, but on XEN ML I found that undocumented 0xf1 opcode (ICEBP) does it. We should handle it regardless of nested vmx. Your patch already calls __vmx_complete_interrupts() on nested idt_vectoring_info, so all potential problem that it may cause should be addressed anyway. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html