On Mon, 2023-04-03 at 08:02 -0700, Sean Christopherson wrote: > On Mon, Apr 03, 2023, Huang, Kai wrote: > > > > > > I checked the code again and find the comment of > > > nested_vmx_check_permission(). > > > > > > "/* > > > �* Intel's VMX Instruction Reference specifies a common set of > > > prerequisites > > > �* for running VMX instructions (except VMXON, whose prerequisites are > > > �* slightly different). It also specifies what exception to inject > > > otherwise. > > > �* Note that many of these exceptions have priority over VM exits, so they > > > �* don't have to be checked again here. > > > �*/" > > > > > > I think the Note part in the comment has tried to callout why the check > > > for compatibility mode is unnecessary. > > > > > > But I have a question here, nested_vmx_check_permission() checks that the > > > vcpu is vmxon, otherwise it will inject a #UD. Why this #UD is handled in > > > the VMExit handler specifically? Not all #UDs have higher priority than VM > > > exits? > > > > > > According to SDM Section "Relative Priority of Faults and VM Exits": > > > "Certain exceptions have priority over VM exits. These include > > > invalid-opcode exceptions, ..." > > > Seems not further classifications of #UDs. > > > > This is clarified in the pseudo code of VMX instructions in the SDM. If you > > look at the pseudo code, all VMX instructions except VMXON (obviously) have > > something like below: > > > > IF (not in VMX operation) ... > > THEN #UD; > > ELSIF in VMX non-root operation > > THEN VMexit; > > > > So to me "this particular" #UD has higher priority over VM exits (while other > > #UDs may not). > > > But IIUC above #UD won't happen when running VMX instruction in the guest, > > because if there's any live guest, the CPU must already have been in VMX > > operation. So below check in nested_vmx_check_permission(): > > > > if (!to_vmx(vcpu)->nested.vmxon) { > > kvm_queue_exception(vcpu, UD_VECTOR); > > return 0; > > } > > > > is needed to emulate the case that guest runs any other VMX instructions before > > VMXON. > > Yep. IMO, the pseucode is misleading/confusing, the "in VMX non-root operation" > check should really come first. The VMXON pseudocode has the same awkward > sequence: > > IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ... > THEN #UD; > ELSIF not in VMX operation > THEN > IF (CPL > 0) or (in A20M mode) or > (the values of CR0 and CR4 are not supported in VMX operation) > THEN #GP(0); > ELSIF in VMX non-root operation > THEN VMexit; > ELSIF CPL > 0 > THEN #GP(0); > ELSE VMfail("VMXON executed in VMX root operation"); > FI; > > > whereas I find this sequence for VMXON more representative of what actually happens: > > IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ... > THEN #UD > > IF in VMX non-root operation > THEN VMexit; > > IF CPL > 0 > THEN #GP(0) > > IF in VMX operation > THEN VMfail("VMXON executed in VMX root operation"); > > IF (in A20M mode) or > (the values of CR0 and CR4 are not supported in VMX operation) > THEN #GP(0); Perhaps we need to live with the fact that the pseudo code in the SDM can be buggy :)