Re: [PATCH] KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail

Sean Christopherson <sean.j.christopherson@xxxxxxxxx> · Tue, 31 Jul 2018 08:50:24 -0700

On Fri, 2018-07-27 at 21:48 -0700, Krish Sadhukhan wrote:
> 
> On 07/24/2018 06:31 AM, Sean Christopherson wrote:
> > 
> > On Mon, 2018-07-23 at 15:26 -0700, Krish Sadhukhan wrote:
> > > 
> > > On 07/19/2018 10:56 AM, Sean Christopherson wrote:
> > > > 

...

> > > > +	/*
> > > > +	 * This nasty bit of open coding is a compromise between blindly
> > > > +	 * loading L1's MSRs using the exit load lists (incorrect emulation
> > > > +	 * of VMFail), leaving the nested VM's MSRs in the software model
> > > > +	 * (incorrect behavior) and snapshotting the modified MSRs (too
> > > > +	 * expensive since the lists are unbound by hardware).  For each
> > > > +	 * MSR that was (prematurely) loaded from the nested VMEntry load
> > > > +	 * list, reload it from the exit load list if it exists and differs
> > > > +	 * from the guest value.  The intent is to stuff host state as
> > > > +	 * silently as possible, not to fully process the exit load list.
> > > > +	 */
> > > > +	msr.host_initiated = false;
> > > > +	for (i = 0; i < vmcs12->vm_entry_msr_load_count; i++) {
> > > > +		gpa = vmcs12->vm_entry_msr_load_addr + (i * sizeof(g));
> > > > +		if (kvm_vcpu_read_guest(vcpu, gpa, &g, sizeof(g))) {
> > > > +			pr_debug_ratelimited(
> > > > +				"%s read MSR index failed (%u, 0x%08llx)\n",
> > > > +				__func__, i, gpa);
> > > > +			goto vmabort;
> > > > +		}
> > > > +
> > > > +		for (j = 0; j < vmcs12->vm_exit_msr_load_count; j++) {
> > > > +			gpa = vmcs12->vm_exit_msr_load_addr + (j * sizeof(h));
> > > > +			if (kvm_vcpu_read_guest(vcpu, gpa, &h, sizeof(h))) {
> > > > +				pr_debug_ratelimited(
> > > > +					"%s read MSR failed (%u, 0x%08llx)\n",
> > > > +					__func__, j, gpa);
> > > > +				goto vmabort;
> > > > +			}
> > > > +			if (h.index != g.index)
> > > > +				continue;
> > > > +			if (h.value == g.value)
> > > > +				break;
> > > > +
> > > > +			if (nested_vmx_load_msr_check(vcpu, &h)) {
> > > > +				pr_debug_ratelimited(
> > > > +					"%s check failed (%u, 0x%x, 0x%x)\n",
> > > > +					__func__, j, h.index, h.reserved);
> > > > +				goto vmabort;
> > > > +			}
> > > > +
> > > > +			msr.index = h.index;
> > > > +			msr.data = h.value;
> > > > +			if (kvm_set_msr(vcpu, &msr)) {
> > > > +				pr_debug_ratelimited(
> > > > +					"%s WRMSR failed (%u, 0x%x, 0x%llx)\n",
> > > > +					__func__, j, h.index, h.value);
> > > > +				goto vmabort;
> > > > +			}
> > > > +		}
> > > > +	}
> > > > +
> > > Is it possible to re-use nested_vmx_load_msr() instead of duplicating
> > > the same loops ?
> >
> > We could use nested_vmx_load_msr() as-is to directly load on-exit list,
> > but that has the downside of potentially overwriting MSRs that should
> > not have been touched, or triggering side effects that should not have
> > occurred.  I considered modifying nested_vmx_load_msr() to splice in
> > this optional behavior, but the resulting code would be hideous and I
> > didn't want to convolute an otherwise straightforward function for this
> > one-off case.
>
> I agree.  Does it make sense to create a new function and call it, say, 
> nested_vmx_load_modified_msr or nested_vmx_load_select_msr which might 
> find a usage in another part of the code in future ?

I don't think so, processing the MSR load lists should be unique to
nested VMEnter.