Re: [PATCH v4 3/4] KVM: nSVM: implement the nested idle halt intercept

Sean Christopherson <seanjc@xxxxxxxxxx> · Thu, 19 Dec 2024 17:01:43 -0800

On Tue, Oct 22, 2024, Manali Shukla wrote:
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index d5314cb7dff4..feb241110f1a 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -178,6 +178,14 @@ void recalc_intercepts(struct vcpu_svm *svm)
>  	} else {
>  		WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK));
>  	}
> +
> +	/*
> +	 * Clear the HLT intercept for L2 guest when the Idle HLT intercept feature
> +	 * is enabled on the platform and the guest can use the Idle HLT intercept
> +	 * feature.
> +	 */
> +	if (guest_can_use(&svm->vcpu, X86_FEATURE_IDLE_HLT))
> +		vmcb_clr_intercept(c, INTERCEPT_HLT);

This is wrong.  KVM needs to honor the intercept of vmcb12.  If L1 wants to
intercept HLT, then KVM needs to configure vmcb02 to intercept HLT, regradless
of whether or not L1 is utilizing INTERCEPT_IDLE_HLT.

Given how KVM currently handles intercepts for nested SVM, I'm pretty sure you
can simply do nothing.  recalc_intercepts() starts with KVM's intercepts (from
vmcb01), and adds in L1's intercepts.  So unless there is a special case, the
default behavior should Just Work.

	for (i = 0; i < MAX_INTERCEPT; i++)
		c->intercepts[i] = h->intercepts[i];

	...

	for (i = 0; i < MAX_INTERCEPT; i++)
		c->intercepts[i] |= g->intercepts[i];

KVM's approach creates all kinds of virtualization holes, e.g. L1 can utilize
IDLE_HLT even if the feature isn't advertised to L1.  But that's true for quite
literally all feature-based intercepts, so for better or worse, I don't think
it makes sense to try and change that approach for this feature.

> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e86b79e975d3..38d546788fc6 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4425,6 +4425,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
> +	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IDLE_HLT);
>  
>  	svm_recalc_instruction_intercepts(vcpu, svm);
>  
> @@ -5228,6 +5229,9 @@ static __init void svm_set_cpu_caps(void)
>  		if (vnmi)
>  			kvm_cpu_cap_set(X86_FEATURE_VNMI);
>  
> +		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
> +			kvm_cpu_cap_set(X86_FEATURE_IDLE_HLT);

kvm_cpu_cap_check_and_set() does this for you.

> +
>  		/* Nested VM can receive #VMEXIT instead of triggering #GP */
>  		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>  	}
> -- 
> 2.34.1
>