Re: [PATCH v3] KVM: VMX: Enable Notify VM exit

Xiaoyao Li <xiaoyao.li@xxxxxxxxx> · Fri, 25 Feb 2022 23:04:46 +0800

On 2/25/2022 10:54 PM, Jim Mattson wrote:
On Tue, Feb 22, 2022 at 10:19 PM Chenyi Qiang <chenyi.qiang@xxxxxxxxx> wrote:

From: Tao Xu <tao3.xu@xxxxxxxxx>

A malicious virtual machine can leave a CPU stuck because no event
window opens up, e.g., an infinite loop in microcode on nested #AC
(CVE-2015-5307). Without an event window, no event (NMI, SMI or IRQ)
can be delivered, so the CPU becomes unavailable to the host and to
other VMs.

The VMM can enable notify VM exits, so that a VM exit is generated if
no event window occurs in VMX non-root mode for a specified amount of
time (the notify window).

Feature enabling:
- The new secondary execution control SECONDARY_EXEC_NOTIFY_VM_EXITING
   is introduced to enable this feature. The VMM can set the
   NOTIFY_WINDOW vmcs field to adjust the expected notify window.
- Expose a module param that lets the admin configure the notify
   window, in units of the crystal clock.
   - if notify_window < 0, feature disabled;
   - if notify_window >= 0, feature enabled;
- There's a possibility, however small, that a notify VM exit happens
   with VM_CONTEXT_INVALID set in the exit qualification. In this case,
   the vcpu can no longer run. To avoid killing a well-behaved guest,
   the notify window defaults to -1, which disables the feature.
- It's safe to set the notify window even to zero, since an internal
   hardware threshold is added to vmcs.notify_window.
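
To make the enabling rule above concrete, here is a minimal standalone
sketch of how the module param could map onto the two vmcs controls.
It is not the patch's code: struct sketch_vmcs, sketch_setup_notify()
and the SKETCH_NOTIFY_VM_EXITING bit value are hypothetical stand-ins,
assumed only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit for this sketch; not taken from the kernel headers. */
#define SKETCH_NOTIFY_VM_EXITING (1u << 31)

/* Hypothetical stand-in for the two vmcs fields discussed above. */
struct sketch_vmcs {
	uint32_t secondary_exec_control;
	uint32_t notify_window;
};

/* notify_window < 0: disabled; notify_window >= 0 (zero included): enabled. */
static bool sketch_notify_enabled(int notify_window)
{
	return notify_window >= 0;
}

/* Apply the module param value to the (hypothetical) vmcs controls. */
static void sketch_setup_notify(struct sketch_vmcs *vmcs, int notify_window)
{
	assert(vmcs != NULL);

	if (!sketch_notify_enabled(notify_window)) {
		vmcs->secondary_exec_control &= ~SKETCH_NOTIFY_VM_EXITING;
		return;
	}

	vmcs->secondary_exec_control |= SKETCH_NOTIFY_VM_EXITING;
	/* Hardware adds its own internal threshold on top of this value. */
	vmcs->notify_window = (uint32_t)notify_window;
}
```

With the default of -1 the control bit stays clear, so a well-behaved
guest is never exposed to a VM_CONTEXT_INVALID exit unless the admin
opts in.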

What causes a VM_CONTEXT_INVALID VM-exit? How small is this possibility?

For now, no case sets the VM_CONTEXT_INVALID bit.

In the future, it must be some fatal case in which the vmcs is
corrupted.

Nested handling:
- Nested notify VM exits are not supported yet. Keep the same notify
   window control in vmcs02 as in vmcs01, so that L1 can't escape the
   restriction of notify VM exits by launching an L2 VM.
- When the L2 VM's context is invalid, synthesize a nested
   EXIT_REASON_TRIPLE_FAULT to L1 so that L1 isn't killed because of
   L2's VM_CONTEXT_INVALID.
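
The decision described in the nested handling above can be sketched as
a small standalone function. This is only an illustration under stated
assumptions: struct sketch_vcpu, enum sketch_action and the
SKETCH_VM_CONTEXT_INVALID bit position are hypothetical, and resuming
the guest when the context is still valid is an assumption, not
something the description spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit position for this sketch only. */
#define SKETCH_VM_CONTEXT_INVALID (1ull << 0)

/* Hypothetical outcomes of handling a notify VM exit in L0. */
enum sketch_action {
	SKETCH_RESUME_GUEST,        /* context still valid (assumed) */
	SKETCH_NESTED_TRIPLE_FAULT, /* L2 invalid: reflect to L1 */
	SKETCH_VCPU_UNUSABLE,       /* L1 invalid: vcpu can no longer run */
};

/* Hypothetical per-vcpu state needed for the decision. */
struct sketch_vcpu {
	uint64_t exit_qualification;
	bool in_l2; /* true when the exit came from an L2 guest */
};

static enum sketch_action sketch_handle_notify(const struct sketch_vcpu *vcpu)
{
	assert(vcpu != NULL);

	if (!(vcpu->exit_qualification & SKETCH_VM_CONTEXT_INVALID))
		return SKETCH_RESUME_GUEST;

	/* Only the L2 context is lost, so L1 sees a synthesized triple fault. */
	if (vcpu->in_l2)
		return SKETCH_NESTED_TRIPLE_FAULT;

	return SKETCH_VCPU_UNUSABLE;
}
```

Note that SKETCH_NESTED_TRIPLE_FAULT carries no hint that L0, rather
than L2 itself, caused the shutdown.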

I don't like the idea of making things up without notifying userspace
that this is fictional. How is my customer running nested VMs supposed
to know that L2 didn't actually shut down, but L0 killed it because the
notify window was exceeded? If this information isn't reported to
userspace, I have no way of getting the information to the customer.

Then maybe use a dedicated software-defined VM exit for it instead of
reusing triple fault?