On Thu, Feb 27, 2020 at 12:08:55PM -0800, Krish Sadhukhan wrote:
>
> On 2/27/20 9:44 AM, Sean Christopherson wrote:
> >Consult only the basic exit reason, i.e. bits 15:0 of vmcs.EXIT_REASON,
> >when determining whether a nested VM-Exit should be reflected into L1 or
> >handled by KVM in L0.
> >
> >For better or worse, the switch statement in nested_vmx_exit_reflected()
> >currently defaults to "true", i.e. reflects any nested VM-Exit without
> >dedicated logic. Because the case statements only contain the basic
> >exit reason, any VM-Exit with modifier bits set will be reflected to L1,
> >even if KVM intended to handle it in L0.
> >
> >Practically speaking, this only affects EXIT_REASON_MCE_DURING_VMENTRY,
> >i.e. a #MC that occurs on nested VM-Enter would be incorrectly routed to
> >L1, as "failed VM-Entry" is the only modifier that KVM can currently
> >encounter. The SMM modifiers will never be generated as KVM doesn't
> >support/employ a SMI Transfer Monitor. Ditto for "exit from enclave",
> >as KVM doesn't yet support virtualizing SGX, i.e. it's impossible to
> >enter an enclave in a KVM guest (L1 or L2).
> >
> It seems nested_vmx_exit_reflected() deals only with the basic exit reason.
> If it doesn't need anything beyond bits 15:0, may be vmx_handle_exit() can
> pass just the base exit reason ?

Argh. I was going to simply respond with "It traces exit_reason via
trace_kvm_nested_vmexit().", but then I looked at the tracing code :-(

The tracepoints that print the names of the VM-Exit are flawed in the sense
that they'll always print the raw value for VM-Exits with modifiers. E.g. a
consistency check VM-Exit on invalid guest state will print 0x80000021
instead of INVALID_STATE.

Stripping bits 31:16 when invoking the tracepoint would fix the immediate
issue, but I'm not sure I like that approach because doing so drops
information that could potentially be quite helpful, e.g. if nested VM-Exit
injection injected EXIT_REASON_MSR_LOAD_FAIL without also setting
VMX_EXIT_REASONS_FAILED_VMENTRY, which could break/confuse the L1 VMM.

I'm also not remotely confident that we won't screw this up again in the
future :-) So part of me thinks the best way to resolve the printing would
be to modify VMX_EXIT_REASONS to do "| VMX_EXIT_REASONS_FAILED_VMENTRY"
where appropriate, i.e. on INVALID_STATE, MSR_LOAD_FAIL and
MCE_DURING_VMENTRY. The downside of that approach is it breaks again when
new modifiers come along, e.g. SGX's ENCLAVE_EXIT. But again, the modifier
is likely useful information.

I think the most foolproof and informative way to handle this would be to
add a macro and/or helper function, e.g. kvm_print_vmx_exit_reason(), to
wrap __print_symbolic(__entry->exit_code, VMX_EXIT_REASONS) so that it
prints both the name of the basic exit reason as well as the names for any
modifiers.

TL;DR: I still like this patch as is, especially since it'll be easy to
backport. I'll send a separate patch for the tracepoint issue.
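
For illustration only, here is a minimal userspace sketch of the kind of
helper described above: it looks up the basic exit reason (bits 15:0) by
name, then appends the name of each modifier bit, falling back to raw hex
for anything it doesn't recognize. The function name
format_vmx_exit_reason(), the struct, and the lookup tables are all
hypothetical and are not the kernel's tracepoint code; the numeric values
are, to the best of my knowledge, the ones in
arch/x86/include/uapi/asm/vmx.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Values believed to match arch/x86/include/uapi/asm/vmx.h. */
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u
#define EXIT_REASON_INVALID_STATE       33u
#define EXIT_REASON_MSR_LOAD_FAIL       34u
#define EXIT_REASON_MCE_DURING_VMENTRY  41u

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical value/name pair, loosely modeled on __print_symbolic tables. */
struct exit_sym {
	uint32_t val;
	const char *name;
};

/* Deliberately tiny subset of basic exit reasons, for the example only. */
static const struct exit_sym basic_reasons[] = {
	{ EXIT_REASON_INVALID_STATE,      "INVALID_STATE" },
	{ EXIT_REASON_MSR_LOAD_FAIL,      "MSR_LOAD_FAIL" },
	{ EXIT_REASON_MCE_DURING_VMENTRY, "MCE_DURING_VMENTRY" },
};

/* Known modifier bits; new ones would be added here. */
static const struct exit_sym modifiers[] = {
	{ VMX_EXIT_REASONS_FAILED_VMENTRY, "FAILED_VMENTRY" },
};

/*
 * Hypothetical helper: write "<BASIC> [<MODIFIER>...]" into buf.
 * Unknown basic reasons and unknown modifier bits are printed as raw hex
 * so that no information is dropped.
 */
static void format_vmx_exit_reason(uint32_t exit_reason, char *buf, size_t len)
{
	uint32_t basic = exit_reason & 0xffffu;   /* bits 15:0 */
	uint32_t mods = exit_reason & ~0xffffu;   /* bits 31:16 */
	const char *name = NULL;
	size_t off = 0, i;

	assert(buf != NULL && len > 0);

	for (i = 0; i < ARRAY_LEN(basic_reasons); i++)
		if (basic_reasons[i].val == basic)
			name = basic_reasons[i].name;

	if (name)
		off += (size_t)snprintf(buf, len, "%s", name);
	else
		off += (size_t)snprintf(buf, len, "0x%x", (unsigned int)basic);

	/* Append the name of every modifier bit that is set and known. */
	for (i = 0; i < ARRAY_LEN(modifiers) && off < len; i++) {
		if (!(mods & modifiers[i].val))
			continue;
		off += (size_t)snprintf(buf + off, len - off, " %s",
					modifiers[i].name);
		mods &= ~modifiers[i].val;
	}

	/* Whatever is left is a modifier this table doesn't know about. */
	if (mods && off < len)
		snprintf(buf + off, len - off, " 0x%x", (unsigned int)mods);
}
```

With this sketch, 0x80000021 would come out as "INVALID_STATE
FAILED_VMENTRY", while an MSR_LOAD_FAIL injected without the modifier would
print as a bare "MSR_LOAD_FAIL", so the missing bit stays visible in the
trace.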