RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

"Li, Xin3" <xin3.li@xxxxxxxxx> · Fri, 8 Dec 2023 23:48:37 +0000

> >> >> > Exiting-event identification can also have bit 13 set, indicating that
> >> >> > a nested exception was encountered and caused the VM-exit. When
> >> >> > reinjecting the exception to guests, KVM needs to set the "nested" bit,
> >> >> > right? I suspect some changes to e.g., handle_exception_nmi() are needed.
> >> >>
> >> >> The current patch relies on kvm_multiple_exception() to do that.  But TBH,
> >> >> I'm not sure it can recognize all nested cases.  I probably should revisit it.
> >> >
> >> >So the conclusion is that kvm_multiple_exception() is smart enough, and
> >> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
> >> >
> >> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
> >> >Support, there is a statement at the end of Exiting-event identification:
> >> >
> >> >(The value of this bit is always identical to that of the valid bit of
> >> >the original-event identification field.)
> >> >
> >> >It means that even w/o VMX Nested-Exception support, a VMM already
> >> >knows if an exception is a nested exception encountered during delivery of
> >> >another event in an exception caused VM exit (exit reason 0).  This is
> >> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
> >> >vmx_complete_interrupts() immediately after VM exits.
> >> >
> >> >vmx_complete_interrupts() simply queues the original exception if there is
> >> >one, and later the nested exception causing the VM exit could be cancelled
> >> >if it is a shadow page fault.  However if the shadow page fault is caused
> >> >by a guest page fault, KVM injects it as a nested exception to have guest
> >> >fix its page table.
> >> >
> >> >I will add comments about this background in the next iteration.
> >>
> >> is it possible that the CPU encounters an exception and causes VM-exit during
> >> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
> >> by vmx_complete_interrupts().
> >
> >I guess the following case is what you're suggesting:
> >KVM injects an external interrupt after shadow page tables are nuked.
> >
> >vmx_complete_interrupts() is called after each VM exit to clear both
> >interrupt and exception queues, which means it always pushes the
> >deepest event if there is an original event.  In the above case, the
> >original event is the external interrupt KVM just tried to inject.
> 
> in my understanding, your point is:
> 1. if bit 13 of the Exiting-event identification is set, the original-event
> identification field should be valid.
> 2. vmx_complete_interrupts() is done immediately after VM exits and reads
> original-event identification and reinjects the event there.
> 3. if KVM injects the exception in exiting-event identification
> to guest, KVM doesn't need to read the bit 13 because kvm_multiple_exception()
> is "smart enough" and recognizes the exception as a nested exception: if
> bit 13 is 1, one exception must have been queued in #2.
> 
> my question is:
> what if the event in original-event identification is an interrupt e.g.,
> external interrupt or NMI, rather than an exception?  vmx_complete_interrupts()
> won't queue an exception, then how can KVM or kvm_multiple_exception()
> know the exception that caused VM-exit is a nested exception w/o reading
> bit 13 of the Exiting-event identification?

The good news is that vmx_complete_interrupts() still queues the event
even if it's not a hardware exception.  It's just that kvm_multiple_exception()
doesn't check whether there is an original interrupt or NMI, because IDT event
delivery doesn't care about such a case.
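
To make the difference concrete, here is a simplified, self-contained model
of the requeue step.  It is a sketch only, not the actual KVM code: the
struct and helper names (vcpu_events, requeue_original_event(),
sees_prior_exception(), sees_prior_event()) are hypothetical.  Only the valid
bit (bit 31) and the event type field (bits 10:8) follow the VMCS
event-information layout, and software event types are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout of the original-event identification (IDT-vectoring info) field. */
#define EVT_INFO_VALID      (1u << 31)   /* bit 31: field is valid */
#define EVT_INFO_TYPE_MASK  (7u << 8)    /* bits 10:8: event type */
#define EVT_TYPE_EXT_INTR   (0u << 8)
#define EVT_TYPE_NMI        (2u << 8)
#define EVT_TYPE_HW_EXC     (3u << 8)

/* Hypothetical stand-in for the per-vCPU event queues. */
struct vcpu_events {
	bool exception_queued;
	bool interrupt_queued;
	bool nmi_queued;
};

/* Models the requeue done after every VM exit: any valid original event
 * is queued again, whatever its type. */
static void requeue_original_event(struct vcpu_events *ev, uint32_t orig_info)
{
	ev->exception_queued = false;
	ev->interrupt_queued = false;
	ev->nmi_queued = false;

	if (!(orig_info & EVT_INFO_VALID))
		return;

	switch (orig_info & EVT_INFO_TYPE_MASK) {
	case EVT_TYPE_EXT_INTR:
		ev->interrupt_queued = true;
		break;
	case EVT_TYPE_NMI:
		ev->nmi_queued = true;
		break;
	case EVT_TYPE_HW_EXC:
		ev->exception_queued = true;
		break;
	default:
		break;	/* software event types omitted in this sketch */
	}
}

/* IDT-style check: only a previously queued exception counts. */
static bool sees_prior_exception(const struct vcpu_events *ev)
{
	return ev->exception_queued;
}

/* FRED-style check: any previously queued event makes the new exception
 * a nested one. */
static bool sees_prior_event(const struct vcpu_events *ev)
{
	return ev->exception_queued || ev->interrupt_queued || ev->nmi_queued;
}
```

With an original external interrupt, the event is still queued, but only the
second check notices it.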

I think your point is more that we should check it when FRED is enabled
for a guest.  Yes, architecturally we should do it.

What I want to emphasize is that bit 13 of the exiting-event identification
is set to the valid bit of the original-event identification, so they are
logically the same thing when FRED is enabled.  It doesn't matter which one
a VMM reads and uses.  But a VMM doesn't need to differentiate between FRED
and IDT if it reads the info from the original-event identification.
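
A rough sketch of the two ways to get the same answer, assuming the bit
positions discussed in this thread (bit 13 of the exiting-event
identification, and bit 31 as the valid bit of the original-event
identification).  The macro and function names are made up for illustration
and are not the names used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, taken from the discussion above. */
#define EXIT_EVT_NESTED_EXC  (1u << 13)  /* exiting-event identification, bit 13 */
#define ORIG_EVT_VALID       (1u << 31)  /* original-event identification, valid bit */

/* Option 1: read bit 13 of the exiting-event identification.  This bit is
 * only meaningful with VMX nested-exception support, i.e. for FRED. */
static bool nested_from_exiting_event(uint32_t exiting_event_info)
{
	return (exiting_event_info & EXIT_EVT_NESTED_EXC) != 0;
}

/* Option 2: read the valid bit of the original-event identification.  It is
 * available with or without nested-exception support, so the caller does
 * not have to know whether the guest uses FRED or IDT event delivery. */
static bool nested_from_original_event(uint32_t original_event_info)
{
	return (original_event_info & ORIG_EVT_VALID) != 0;
}
```

When nested-exception support is active the two helpers agree by
definition.  Without it, only the second one reports that the exception was
hit while another event was being delivered.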