On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> Explicitly perform runtime CPUID adjustments as part of the "after set
> CPUID" flow to guard against bugs where KVM consumes stale vCPU/CPUID
> state during kvm_update_cpuid_runtime().  E.g. see commit 4736d85f0d18
> ("KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT").
>
> Whacking each mole individually is not sustainable or robust, e.g. while
> the aforementioned commit fixed KVM's PV features, the same issue lurks for
> Xen and Hyper-V features, Xen and Hyper-V simply don't have any runtime
> features (though spoiler alert, neither should KVM).
>
> Updating runtime features in the "full" path will also simplify adding a
> snapshot of the guest's capabilities, i.e. of caching the intersection of
> guest CPUID and kvm_cpu_caps (modulo a few edge cases).
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/cpuid.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 2b19ff991ceb..e60ffb421e4b 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -345,6 +345,8 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	bitmap_zero(vcpu->arch.governed_features.enabled,
>  		    KVM_MAX_NR_GOVERNED_FEATURES);
>
> +	kvm_update_cpuid_runtime(vcpu);
> +
>  	/*
>  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
>  	 * hardware.  The hardware page walker doesn't let KVM disable GBPAGES,
> @@ -426,8 +428,6 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>  {
>  	int r;
>
> -	__kvm_update_cpuid_runtime(vcpu, e2, nent);
> -
>  	/*
>  	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
>  	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
> @@ -440,6 +440,15 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>  	 * whether the supplied CPUID data is equal to what's already set.
>  	 */
>  	if (kvm_vcpu_has_run(vcpu)) {
> +		/*
> +		 * Note, runtime CPUID updates may consume other CPUID-driven
> +		 * vCPU state, e.g. KVM or Xen CPUID bases.  Updating runtime
> +		 * state before full CPUID processing is functionally correct
> +		 * only because any change in CPUID is disallowed, i.e. using
> +		 * stale data is ok because KVM will reject the change.
> +		 */

If I understand correctly, the sole reason for the __kvm_update_cpuid_runtime
call below is to ensure that kvm_cpuid_check_equal() doesn't fail just because
the current CPUID was also post-processed with runtime updates.  Can we have a
comment stating this?  Or even better, how about moving the call to
__kvm_update_cpuid_runtime() into kvm_cpuid_check_equal(), to emphasize this?
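E.g. something along these lines (a completely untested sketch; the body of
kvm_cpuid_check_equal() is only approximated here, the point is just where
the runtime update would live):

static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu,
				 struct kvm_cpuid_entry2 *e2, int nent)
{
	struct kvm_cpuid_entry2 *orig;
	int i;

	/*
	 * Apply the runtime updates to the incoming entries so that the
	 * comparison below doesn't fail merely because of bits that KVM
	 * itself toggles in the current CPUID at runtime.
	 */
	__kvm_update_cpuid_runtime(vcpu, e2, nent);

	if (nent != vcpu->arch.cpuid_nent)
		return -EINVAL;

	for (i = 0; i < nent; i++) {
		orig = &vcpu->arch.cpuid_entries[i];
		if (e2[i].function != orig->function ||
		    e2[i].index != orig->index ||
		    e2[i].flags != orig->flags ||
		    e2[i].eax != orig->eax || e2[i].ebx != orig->ebx ||
		    e2[i].ecx != orig->ecx || e2[i].edx != orig->edx)
			return -EINVAL;
	}

	return 0;
}

That way it would be obvious that the runtime update of the new entries exists
purely for the sake of the equality check.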
> +		__kvm_update_cpuid_runtime(vcpu, e2, nent);
> +
>  		r = kvm_cpuid_check_equal(vcpu, e2, nent);
>  		if (r)
>  			return r;

Overall I am not 100% sure which is better.

Before the patch it was roughly like this:

1. Post-process the user-given CPUID with bits of KVM runtime state (like
   xcr0).  At that point vcpu->arch.cpuid_entries is stale but consistent,
   it is just the old CPUID.
2. kvm_hv_vcpu_init call (IMHO this call can be moved to
   kvm_vcpu_after_set_cpuid).
3. kvm_check_cpuid on the user-provided CPUID.
4. Update vcpu->arch.cpuid_entries with the new, post-processed CPUID.
5. kvm_get_hypervisor_cpuid - I think this can also be cosmetically moved to
   kvm_vcpu_after_set_cpuid.
6. kvm_vcpu_after_set_cpuid itself.

After this change it works like this:

1. kvm_hv_vcpu_init (again, this belongs more to kvm_vcpu_after_set_cpuid).
2. kvm_check_cpuid on the user CPUID without post-processing - in theory this
   can cause bugs.
3. Update vcpu->arch.cpuid_entries with the new CPUID, but without
   post-processing.
4. kvm_get_hypervisor_cpuid.
5. kvm_update_cpuid_runtime.
6. The old kvm_vcpu_after_set_cpuid.

I'm honestly not sure which is better, but IMHO moving kvm_hv_vcpu_init and
kvm_get_hypervisor_cpuid into kvm_vcpu_after_set_cpuid would clean up this
mess a bit, regardless of this patch.

Best regards,
	Maxim Levitsky