> >> >> > Exiting-event identification can also have bit 13 set, indicating a
> >> >> > nested exception encountered and caused VM-exit. when reinjecting the
> >> >> > exception to guests, kvm needs to set the "nested" bit, right? I
> >> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
> >> >>
> >> >> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
> >> >> not sure it can recognize all nested cases. I probably should revisit it.
> >> >
> >> >So the conclusion is that kvm_multiple_exception() is smart enough, and
> >> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
> >> >
> >> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
> >> >Support, there is a statement at the end of Exiting-event identification:
> >> >
> >> >(The value of this bit is always identical to that of the valid bit of
> >> >the original-event identification field.)
> >> >
> >> >It means that even w/o VMX Nested-Exception support, a VMM already knows
> >> >if an exception is a nested exception encountered during delivery of
> >> >another event in an exception caused VM exit (exit reason 0). This is
> >> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
> >> >vmx_complete_interrupts() immediately after VM exits.
> >> >
> >> >vmx_complete_interrupts() simply queues the original exception if there is
> >> >one, and later the nested exception causing the VM exit could be cancelled
> >> >if it is a shadow page fault. However if the shadow page fault is caused
> >> >by a guest page fault, KVM injects it as a nested exception to have guest
> >> >fix its page table.
> >> >
> >> >I will add comments about this background in the next iteration.
> >>
> >> is it possible that the CPU encounters an exception and causes VM-exit during
> >> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
> >> by vmx_complete_interrupts().
> >
> >I guess the following case is what you're suggesting:
> >KVM injects an external interrupt after shadow page tables are nuked.
> >
> >vmx_complete_interrupts() are called after each VM exit to clear both
> >interrupt and exception queues, which means it always pushes the
> >deepest event if there is an original event. In the above case, the
> >original event is the external interrupt KVM just tried to inject.
>
> in my understanding, your point is:
> 1. if bit 13 of the Exiting-event identification is set. the original-event
>    identification field should be valid.
> 2. vmx_complete_interrupts() is done immediately after VM exits and reads
>    original-event identification and reinjects the event there.
> 3. if KVM injects the exception in exiting-event identification
>    to guest, KVM doesn't need to read the bit 13 because kvm_multiple_exception()
>    is "smart enough" and recognize the exception as nested-exception because if
>    bit 13 is 1, one exception must has been queued in #2.
>
> my question is:
> what if the event in original-event identification is an interrupt e.g.,
> external interrupt or NMI, rather than exception. vmx_complete_interrupts()
> won't queue an exception, then how can KVM or kvm_multiple_exception() know the
> exception that caused VM-exit is an nested exception w/o reading bit 13 of the
> Exiting-event identification?

The good news is that vmx_complete_interrupts() still queues the event
even if it is not a hardware exception. It's just that
kvm_multiple_exception() doesn't check whether there is an original
interrupt or NMI, because IDT event delivery doesn't care about such a
case.

I think your point is more that we should check it when FRED is enabled
for a guest. Yes, architecturally we should do it.

What I want to emphasize is that bit 13 of the exiting-event
identification is set to the valid bit of the original-event
identification; they are logically the same thing when FRED is enabled.
It doesn't matter which one a VMM reads and uses.
But a VMM doesn't need to differentiate between FRED and IDT if it reads
the info from the original-event identification.
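
To make the equivalence concrete, here is a rough standalone sketch, not actual KVM code. The macro and function names are made up for illustration; the only things it assumes are the bit positions discussed above (bit 13 of the exiting-event identification, the valid bit 31 and type bits 10:8 of the original-event identification).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions come from the discussion. */
#define ORIG_EVENT_VALID	(UINT32_C(1) << 31)	/* original-event identification: valid */
#define ORIG_EVENT_TYPE_SHIFT	8			/* original-event identification: type, bits 10:8 */
#define EVENT_TYPE_EXT_INTR	UINT32_C(0)
#define EVENT_TYPE_NMI		UINT32_C(2)
#define EVENT_TYPE_HW_EXCEPTION	UINT32_C(3)
#define EXIT_EVENT_NESTED	(UINT32_C(1) << 13)	/* exiting-event identification: nested */

/*
 * Works the same for IDT and FRED guests: the exception that caused the
 * VM exit hit during delivery of another event iff the original-event
 * identification is valid, whatever the type of that original event.
 */
static bool nested_from_orig_event(uint32_t orig_event_info)
{
	return (orig_event_info & ORIG_EVENT_VALID) != 0;
}

/* Only meaningful with VMX nested-exception support: reads bit 13 instead. */
static bool nested_from_exit_event(uint32_t exit_event_info)
{
	return (exit_event_info & EXIT_EVENT_NESTED) != 0;
}
```

With this, an original event of type external interrupt or NMI is still reported as nested by nested_from_orig_event(), which is the case you asked about; the type field is not consulted at all.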