Re: [PATCH] KVM: x86: Prevent L0 VMM from modifying L2 VM registers via ioctl

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Fri, 17 May 2024 18:51:50 +0200

On 5/17/24 13:37, Liang Chen wrote:

>> The attached cleaned up reproducer shows that the problem is simply that
>> EFLAGS.VM is set in 64-bit mode.  To fix it, it should be enough to do
>> a nested_vmx_vmexit(vcpu, EXIT_REASON_TRIPLE_FAULT, 0, 0); just like
>> a few lines below.

> Yes, that was the situation we were trying to deal with. However, I am
> not quite sure if I fully understand the suggestion, "To fix it, it
> should be enough to do a nested_vmx_vmexit(vcpu,
> EXIT_REASON_TRIPLE_FAULT, 0, 0); just like a few lines below.". From
> what I see,
> "KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm) == true" in
> __vmx_handle_exit can be a result of an invalid VMCS12 from L1 that
> somehow escaped checking when trapped into L0 in nested_vmx_run. It is
> not convenient to tell whether it was a result of userspace
> register_set ops, as we are discussing, or an invalid VMCS12 supplied
> by L1.

Right, KVM assumes that it can delegate the "Checks on Guest Segment
Registers" to the processor if a field is copied straight from VMCS12 to
VMCS02.  In this case the segments are not set up for virtual-8086 mode.
Interestingly, the Intel manual seems to say that, for the purpose of
checking guest state, EFLAGS.VM takes precedence even when the "IA-32e
mode guest" control is 1.  AMD's manual instead says that EFLAGS.VM is
completely ignored in 64-bit mode.
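
To make the offending combination concrete, here is a minimal user-space
sketch, not KVM code.  The helper name is made up for illustration, and
the rule it encodes is an assumption based on the SDM's guest RFLAGS
check (VM must be 0 if the "IA-32e mode guest" VM-entry control is 1 or
if CR0.PE is 0).  The bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_VM        (1u << 17) /* virtual-8086 mode flag */
#define X86_CR0_PE           (1u << 0)  /* protected mode enable */
#define VM_ENTRY_IA32E_MODE  (1u << 9)  /* "IA-32e mode guest" VM-entry control */

/*
 * Hypothetical helper (not a KVM function): returns true when RFLAGS.VM is
 * set in a state where, under the assumed rule, VM entry would fail.
 */
static bool guest_rflags_vm_inconsistent(uint64_t rflags, uint64_t cr0,
					 uint32_t vm_entry_controls)
{
	if (!(rflags & X86_EFLAGS_VM))
		return false;

	return (vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
	       !(cr0 & X86_CR0_PE);
}

/* Usage example: the first case is the one the reproducer sets up. */
void example_checks(void)
{
	/* EFLAGS.VM set while entering a 64-bit guest: inconsistent. */
	assert(guest_rflags_vm_inconsistent(X86_EFLAGS_VM | 0x2, X86_CR0_PE,
					    VM_ENTRY_IA32E_MODE));
	/* EFLAGS.VM set in a 32-bit protected-mode guest: allowed. */
	assert(!guest_rflags_vm_inconsistent(X86_EFLAGS_VM | 0x2, X86_CR0_PE, 0));
}
```

A check of this shape only flags the state.  It does not say whether the
bad value came from userspace or from L1's VMCS12, which is the
ambiguity raised above.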

I need to look more at the sequence of VMLAUNCH/RESUME, KVM_SET_MSR and 
the failed vmentry to understand exactly what the right fix is.

Paolo

> Additionally, nested_vmx_vmexit warns when
> 'vmx->nested.nested_run_pending is true,' saying that "trying to
> cancel vmlaunch/vmresume is a bug".