Manali Shukla <manali.shukla@xxxxxxx> writes:

> From: Manali Shukla <Manali.Shukla@xxxxxxx>
>
> The hypervisor can intercept the HLT instruction by setting the
> HLT-Intercept Bit in the VMCB, causing a VMEXIT. This can be wasteful
> if there are pending V_INTR or V_NMI events, as the hypervisor must
> then initiate a VMRUN to handle them.
>
> If the HLT-Intercept Bit is cleared and the vCPU executes HLT while
> there are pending V_INTR or V_NMI events, the hypervisor won't detect
> them, potentially causing indefinite suspension of the vCPU. This
> poses a problem for enlightened guests that wish to handle the events
> securely.
>
> For Secure AVIC scenarios, if a guest executes HLT while an interrupt
> is pending (in the IRR), the hypervisor has no way to figure out
> whether the guest needs to be re-entered, as it cannot read the guest
> backing page. The Idle HLT intercept feature allows the hypervisor to
> intercept HLT execution only if there are no pending V_INTR or V_NMI
> events.
>
> There are two use cases for the Idle HLT intercept feature:
> - Secure VMs that wish to handle pending events securely without
>   exiting to the hypervisor on HLT (Secure AVIC).
> - An optimization for all VMs to avoid a wasteful VMEXIT during HLT
>   when there are pending events.
>
> On discovering the Idle HLT intercept feature, the KVM hypervisor
> sets the Idle HLT Intercept bit (bit 6, offset 0x14) in the VMCB.
> When the Idle HLT Intercept bit is set, the HLT Intercept bit (bit 0,
> offset 0xF) should be cleared.
>
> Before entering the HLT state, the HLT instruction performs checks in
> the following order:
> - The HLT intercept check: if the intercept is set, it unconditionally
>   triggers SVM_EXIT_HLT (0x78).
> - The Idle HLT intercept check: if the intercept is set and there are
>   no pending V_INTR or V_NMI events, it triggers SVM_EXIT_IDLE_HLT
>   (0xA6).
>
> Details about the Idle HLT intercept feature can be found in the AMD
> APM [1].
>
> [1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
>      Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
>      https://bugzilla.kernel.org/attachment.cgi?id=306250
>
> Signed-off-by: Manali Shukla <Manali.Shukla@xxxxxxx>

LGTM.

Reviewed-by: Nikunj A Dadhania <nikunj@xxxxxxx>

> ---
>  arch/x86/include/asm/svm.h      |  1 +
>  arch/x86/include/uapi/asm/svm.h |  2 ++
>  arch/x86/kvm/svm/svm.c          | 13 ++++++++++---
>  3 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 2b59b9951c90..992050cb83d0 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -116,6 +116,7 @@ enum {
>  	INTERCEPT_INVPCID,
>  	INTERCEPT_MCOMMIT,
>  	INTERCEPT_TLBSYNC,
> +	INTERCEPT_IDLE_HLT = 166,
>  };
>  
>  
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index 1814b413fd57..ec1321248dac 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -95,6 +95,7 @@
>  #define SVM_EXIT_CR14_WRITE_TRAP	0x09e
>  #define SVM_EXIT_CR15_WRITE_TRAP	0x09f
>  #define SVM_EXIT_INVPCID		0x0a2
> +#define SVM_EXIT_IDLE_HLT		0x0a6
>  #define SVM_EXIT_NPF			0x400
>  #define SVM_EXIT_AVIC_INCOMPLETE_IPI	0x401
>  #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS	0x402
> @@ -224,6 +225,7 @@
>  	{ SVM_EXIT_CR4_WRITE_TRAP,	"write_cr4_trap" }, \
>  	{ SVM_EXIT_CR8_WRITE_TRAP,	"write_cr8_trap" }, \
>  	{ SVM_EXIT_INVPCID,		"invpcid" }, \
> +	{ SVM_EXIT_IDLE_HLT,		"idle-halt" }, \
>  	{ SVM_EXIT_NPF,			"npf" }, \
>  	{ SVM_EXIT_AVIC_INCOMPLETE_IPI,	"avic_incomplete_ipi" }, \
>  	{ SVM_EXIT_AVIC_UNACCELERATED_ACCESS,	"avic_unaccelerated_access" }, \
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 78daedf6697b..36f307e71d5d 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1296,8 +1296,12 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>  		svm_set_intercept(svm, INTERCEPT_MWAIT);
>  	}
>  
> -	if (!kvm_hlt_in_guest(vcpu->kvm))
> -		svm_set_intercept(svm, INTERCEPT_HLT);
> +	if (!kvm_hlt_in_guest(vcpu->kvm)) {
> +		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
> +			svm_set_intercept(svm, INTERCEPT_IDLE_HLT);
> +		else
> +			svm_set_intercept(svm, INTERCEPT_HLT);
> +	}
>  
>  	control->iopm_base_pa = iopm_base;
>  	control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
> @@ -3341,6 +3345,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>  	[SVM_EXIT_CR4_WRITE_TRAP]		= cr_trap,
>  	[SVM_EXIT_CR8_WRITE_TRAP]		= cr_trap,
>  	[SVM_EXIT_INVPCID]			= invpcid_interception,
> +	[SVM_EXIT_IDLE_HLT]			= kvm_emulate_halt,
>  	[SVM_EXIT_NPF]				= npf_interception,
>  	[SVM_EXIT_RSM]				= rsm_interception,
>  	[SVM_EXIT_AVIC_INCOMPLETE_IPI]		= avic_incomplete_ipi_interception,
> @@ -3503,7 +3508,7 @@ int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
>  		return interrupt_window_interception(vcpu);
>  	else if (exit_code == SVM_EXIT_INTR)
>  		return intr_interception(vcpu);
> -	else if (exit_code == SVM_EXIT_HLT)
> +	else if (exit_code == SVM_EXIT_HLT || exit_code == SVM_EXIT_IDLE_HLT)
>  		return kvm_emulate_halt(vcpu);
>  	else if (exit_code == SVM_EXIT_NPF)
>  		return npf_interception(vcpu);
> @@ -5224,6 +5229,8 @@ static __init void svm_set_cpu_caps(void)
>  	if (vnmi)
>  		kvm_cpu_cap_set(X86_FEATURE_VNMI);
>  
> +	kvm_cpu_cap_check_and_set(X86_FEATURE_IDLE_HLT);
> +
>  	/* Nested VM can receive #VMEXIT instead of triggering #GP */
>  	kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>  }
> -- 
> 2.34.1
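One more note, mostly for anyone skimming the thread: the check order
described in the commit message can be summarized with a tiny
user-space sketch. Everything below (hlt_exit_reason() and its
parameters) is made up purely for illustration; it is not KVM or
hardware code, just the decision flow the commit message and the APM
describe.

  #include <stdio.h>
  #include <stdbool.h>

  #define SVM_EXIT_HLT       0x078
  #define SVM_EXIT_IDLE_HLT  0x0a6
  #define NO_EXIT            (-1)

  /* Hypothetical helper: which exit (if any) does HLT generate? */
  static int hlt_exit_reason(bool hlt_intercept, bool idle_hlt_intercept,
                             bool v_intr_pending, bool v_nmi_pending)
  {
          /* 1. Plain HLT intercept: exits unconditionally. */
          if (hlt_intercept)
                  return SVM_EXIT_HLT;

          /* 2. Idle HLT intercept: exits only when nothing is pending. */
          if (idle_hlt_intercept && !v_intr_pending && !v_nmi_pending)
                  return SVM_EXIT_IDLE_HLT;

          /* 3. No intercept taken: the vCPU halts inside the guest. */
          return NO_EXIT;
  }

  int main(void)
  {
          /* Idle HLT armed, V_INTR pending: no #VMEXIT (prints -1). */
          printf("%d\n", hlt_exit_reason(false, true, true, false));
          /* Idle HLT armed, nothing pending: prints exit code 0xa6. */
          printf("%#x\n", hlt_exit_reason(false, true, false, false));
          return 0;
  }

So with only the Idle HLT intercept set, as init_vmcb() now arranges
when the CPU has X86_FEATURE_IDLE_HLT, executing HLT with a pending
V_INTR or V_NMI does not generate a #VMEXIT, and the guest can take
the pending event without a round trip through the hypervisor.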