On Fri, Feb 25, 2022 at 8:25 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>
> On Fri, Feb 25, 2022 at 8:07 PM Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
> >
> > On 2/25/2022 11:13 PM, Paolo Bonzini wrote:
> > > On 2/25/22 16:12, Xiaoyao Li wrote:
> > >>>>>
> > >>>>
> > >>>> I don't like the idea of making things up without notifying userspace
> > >>>> that this is fictional. How is my customer running nested VMs supposed
> > >>>> to know that L2 didn't actually shutdown, but L0 killed it because the
> > >>>> notify window was exceeded? If this information isn't reported to
> > >>>> userspace, I have no way of getting the information to the customer.
> > >>>
> > >>> Then, maybe a dedicated software define VM exit for it instead of
> > >>> reusing triple fault?
> > >>>
> > >>
> > >> Second thought, we can even just return Notify VM exit to L1 to tell
> > >> L2 causes Notify VM exit, even thought Notify VM exit is not exposed
> > >> to L1.
> > >
> > > That might cause NULL pointer dereferences or other nasty occurrences.
> >
> > IMO, a well written VMM (in L1) should handle it correctly.
> >
> > L0 KVM reports no Notify VM Exit support to L1, so L1 runs without
> > setting Notify VM exit. If a L2 causes notify_vm_exit with
> > invalid_vm_context, L0 just reflects it to L1. In L1's view, there is no
> > support of Notify VM Exit from VMX MSR capability. Following L1 handler
> > is possible:
> >
> > a) if (notify_vm_exit available & notify_vm_exit enabled) {
> >      handle in b)
> >    } else {
> >      report unexpected vm exit reason to userspace;
> >    }
> >
> > b) similar handler like we implement in KVM:
> >    if (!vm_context_invalid)
> >      re-enter guest;
> >    else
> >      report to userspace;
> >
> > c) no Notify VM Exit related code (e.g. old KVM), it's treated as
> >    unsupported exit reason
> >
> > As long as it belongs to any case above, I think L1 can handle it
> > correctly. Any nasty occurrence should be caused by incorrect handler in
> > L1 VMM, in my opinion.
>
> Please test some common hypervisors (e.g. ESXi and Hyper-V).

I took a look at KVM in Linux v4.9 (one of our more popular guests),
and it will not handle this case well:

        if (exit_reason < kvm_vmx_max_exit_handlers
            && kvm_vmx_exit_handlers[exit_reason])
                return kvm_vmx_exit_handlers[exit_reason](vcpu);
        else {
                WARN_ONCE(1, "vmx: unexpected exit reason 0x%x\n",
                          exit_reason);
                kvm_queue_exception(vcpu, UD_VECTOR);
                return 1;
        }

At least there's an L1 kernel log message for the first unexpected
NOTIFY VM-exit, but after that, there is silence. Just a completely
inexplicable #UD in L2, assuming that L2 is resumable at this point.
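
To make the "first exit is logged, later ones are silent" behaviour
concrete, here is a rough user-space model of that fallback path. It is
only a sketch, not kernel code: struct toy_vcpu, toy_handle_exit(), the
warnings_logged counter and the table size of 65 are all made up for
illustration, and the value 75 for the notify exit reason is an
assumption on my part. The plain counter stands in for the once-only
behaviour of WARN_ONCE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define UD_VECTOR           6   /* #UD, invalid opcode */
#define EXIT_REASON_CPUID   10
#define EXIT_REASON_NOTIFY  75  /* assumed value for the notify exit */
#define MAX_EXIT_HANDLERS   65  /* hypothetical table size, below 75 */

/* Hypothetical stand-in for the vCPU state that matters here. */
struct toy_vcpu {
    int pending_vector;         /* -1 when no exception is queued */
};

typedef int (*exit_handler_t)(struct toy_vcpu *);

/* Number of log lines emitted so far; models the WARN_ONCE latch. */
static int warnings_logged;

/* A known exit reason: handled normally, nothing injected into L2. */
static int handle_cpuid(struct toy_vcpu *vcpu)
{
    vcpu->pending_vector = -1;
    return 1;
}

/* Sparse handler table; reasons without an entry stay NULL. */
static exit_handler_t exit_handlers[MAX_EXIT_HANDLERS] = {
    [EXIT_REASON_CPUID] = handle_cpuid,
};

/* Mirrors the shape of the quoted v4.9 dispatch, nothing more. */
static int toy_handle_exit(struct toy_vcpu *vcpu, unsigned int exit_reason)
{
    assert(vcpu != NULL);

    if (exit_reason < MAX_EXIT_HANDLERS && exit_handlers[exit_reason])
        return exit_handlers[exit_reason](vcpu);

    /* Only the first unexpected exit leaves a trace in the log. */
    if (warnings_logged == 0) {
        fprintf(stderr, "vmx: unexpected exit reason 0x%x\n", exit_reason);
        warnings_logged++;
    }

    /* Every unexpected exit, logged or not, queues #UD for the guest. */
    vcpu->pending_vector = UD_VECTOR;
    return 1;
}
```

Calling toy_handle_exit() twice with EXIT_REASON_NOTIFY queues UD_VECTOR
both times, but warnings_logged stays at 1 after the first call, which
is the silence described above.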