Re: [RFC PATCH v2 3/5] KVM: SVM: Enable Bus lock threshold exit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tom,

Thank you for reviewing my patches.

On 10/1/2024 7:13 PM, Tom Lendacky wrote:
> On 10/1/24 01:34, Manali Shukla wrote:
>> From: Nikunj A Dadhania <nikunj@xxxxxxx>
>>
>> Virtual machines can exploit bus locks to degrade the performance of
>> the system. Bus locks can be caused by Non-WB(Write back) and
>> misaligned locked RMW (Read-modify-Write) instructions and require
>> systemwide synchronization among all processors which can result into
>> significant performance penalties.
>>
>> To address this issue, the Bus Lock Threshold feature is introduced to
>> provide ability to hypervisor to restrict guests' capability of
>> initiating mulitple buslocks, thereby preventing system slowdowns.
>>
>> Support for the buslock threshold is indicated via CPUID function
>> 0x8000000A_EDX[29].
>>
>> On the processors that support the Bus Lock Threshold feature, the
>> VMCB provides a Bus Lock Threshold enable bit and an unsigned 16-bit
>> Bus Lock threshold count.
>>
>> VMCB intercept bit
>> VMCB Offset	Bits	Function
>> 14h	        5	Intercept bus lock operations
>>                         (occurs after guest instruction finishes)
>>
>> Bus lock threshold count
>> VMCB Offset	Bits	Function
>> 120h	        15:0	Bus lock counter
>>
>> When a VMRUN instruction is executed, the bus lock threshold count is
>> loaded into an internal count register. Before the processor executes
>> a bus lock in the guest, it checks the value of this register:
>>
>>  - If the value is greater than '0', the processor successfully
>>    executes the bus lock and decrements the count.
>>
>>  - If the value is '0', the bus lock is not executed, and a #VMEXIT to
>>    the VMM is taken.
>>
>> The bus lock threshold #VMEXIT is reported to the VMM with the VMEXIT
>> code A5h, SVM_EXIT_BUS_LOCK.
>>
>> When SVM_EXIT_BUS_LOCK occurs, the value of the current Bus Lock
>> Threshold counter, which is '0', is loaded into the VMCB. This value
>> must be reset to a default greater than '0' in bus_lock_exit handler
>> in hypervisor, because the behavior of SVM_EXIT_BUS_LOCK is fault
>> like, so the bus lock exit to userspace  happens with the RIP pointing
>> to the offending instruction (which generates buslocks).
>>
>> So, if the value of the Bus Lock Threshold counter remains '0', the
>> guest continuously exits with SVM_EXIT_BUS_LOCK.
>>
>> The generated SVM_EXIT_BUS_LOCK in kvm will exit to user space by
>> setting the KVM_RUN_BUS_LOCK flag in vcpu->run->flags, indicating to
>> the user space that a bus lock has been detected in the guest.
>>
>> Use the already available KVM capability KVM_CAP_X86_BUS_LOCK_EXIT to
>> enable the feature. This feature is disabled by default, it can be
>> enabled explicitly from user space.
>>
>> More details about the Bus Lock Threshold feature can be found in AMD
>> APM [1].
> 
> You should mention in the commit message that this implementation is set
> up to intercept every guest bus lock. The initial value of the threshold
> is 0 in the VMCB, so the very first bus lock will exit, set the
> threshold to 1 (so that the offending instruction can make forward
> progress) but then the value is at 0 again so the next bus lock will exit.

Sure. I will add to V3.

> 
>>
>> [1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
>>      Vol 2, 15.14.5 Bus Lock Threshold.
>>      https://bugzilla.kernel.org/attachment.cgi?id=306250
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
>> Co-developed-by: Manali Shukla <manali.shukla@xxxxxxx>
>> Signed-off-by: Manali Shukla <manali.shukla@xxxxxxx>
>> ---
>>  arch/x86/include/asm/svm.h      |  5 ++++-
>>  arch/x86/include/uapi/asm/svm.h |  2 ++
>>  arch/x86/kvm/svm/nested.c       | 12 ++++++++++++
>>  arch/x86/kvm/svm/svm.c          | 29 +++++++++++++++++++++++++++++
>>  4 files changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index f0dea3750ca9..bad9723f40e1 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -116,6 +116,7 @@ enum {
>>  	INTERCEPT_INVPCID,
>>  	INTERCEPT_MCOMMIT,
>>  	INTERCEPT_TLBSYNC,
>> +	INTERCEPT_BUSLOCK,
>>  };
>>  
>>  
>> @@ -158,7 +159,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>>  	u64 avic_physical_id;	/* Offset 0xf8 */
>>  	u8 reserved_7[8];
>>  	u64 vmsa_pa;		/* Used for an SEV-ES guest */
>> -	u8 reserved_8[720];
>> +	u8 reserved_8[16];
>> +	u16 bus_lock_counter;	/* Offset 0x120 */
>> +	u8 reserved_9[702];
>>  	/*
>>  	 * Offset 0x3e0, 32 bytes reserved
>>  	 * for use by hypervisor/software.
>> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
>> index 1814b413fd57..abf6aed88cee 100644
>> --- a/arch/x86/include/uapi/asm/svm.h
>> +++ b/arch/x86/include/uapi/asm/svm.h
>> @@ -95,6 +95,7 @@
>>  #define SVM_EXIT_CR14_WRITE_TRAP		0x09e
>>  #define SVM_EXIT_CR15_WRITE_TRAP		0x09f
>>  #define SVM_EXIT_INVPCID       0x0a2
>> +#define SVM_EXIT_BUS_LOCK			0x0a5
>>  #define SVM_EXIT_NPF           0x400
>>  #define SVM_EXIT_AVIC_INCOMPLETE_IPI		0x401
>>  #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS	0x402
>> @@ -224,6 +225,7 @@
>>  	{ SVM_EXIT_CR4_WRITE_TRAP,	"write_cr4_trap" }, \
>>  	{ SVM_EXIT_CR8_WRITE_TRAP,	"write_cr8_trap" }, \
>>  	{ SVM_EXIT_INVPCID,     "invpcid" }, \
>> +	{ SVM_EXIT_BUS_LOCK,     "buslock" }, \
>>  	{ SVM_EXIT_NPF,         "npf" }, \
>>  	{ SVM_EXIT_AVIC_INCOMPLETE_IPI,		"avic_incomplete_ipi" }, \
>>  	{ SVM_EXIT_AVIC_UNACCELERATED_ACCESS,   "avic_unaccelerated_access" }, \
>> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
>> index 6f704c1037e5..670092d31f77 100644
>> --- a/arch/x86/kvm/svm/nested.c
>> +++ b/arch/x86/kvm/svm/nested.c
>> @@ -669,6 +669,12 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>>  	vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
>>  	vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
>>  
>> +	/*
>> +	 * The bus lock threshold count should keep running across nested
>> +	 * transitions.
>> +	 * Copied the buslock threshold count from vmcb01 to vmcb02.
> 
> No need to start this part of the comment on a separate line. Also,
> s/Copied/Copy/ (ditto below).
> 

Ack.

>> +	 */
>> +	vmcb02->control.bus_lock_counter = vmcb01->control.bus_lock_counter;
>>  	/* Done at vmrun: asid.  */
>>  
>>  	/* Also overwritten later if necessary.  */
>> @@ -1035,6 +1041,12 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>>  
>>  	}
>>  
>> +	/*
>> +	 * The bus lock threshold count should keep running across nested
>> +	 * transitions.
>> +	 * Copied the buslock threshold count from vmcb02 to vmcb01.
>> +	 */
>> +	vmcb01->control.bus_lock_counter = vmcb02->control.bus_lock_counter;
>>  	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
>>  
>>  	svm_switch_vmcb(svm, &svm->vmcb01);
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index 5ab2c92c7331..41c773a40f99 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -1372,6 +1372,11 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>>  		svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
>>  	}
>>  
>> +	if (vcpu->kvm->arch.bus_lock_detection_enabled)
>> +		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
>> +	else
>> +		svm_clr_intercept(svm, INTERCEPT_BUSLOCK);
> 
> Is the else path really needed?
> 
> Thanks,
> Tom
Correct. It is not needed. I will remove from V3.

> 
>> +
>>  	if (sev_guest(vcpu->kvm))
>>  		sev_init_vmcb(svm);
>>  
>> @@ -3283,6 +3288,24 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
>>  	return kvm_handle_invpcid(vcpu, type, gva);
>>  }
>>  
>> +static int bus_lock_exit(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vcpu_svm *svm = to_svm(vcpu);
>> +
>> +	vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
>> +	vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
>> +
>> +	/*
>> +	 * Reload the counter with value greater than '0'.
>> +	 * The bus lock exit on SVM happens with RIP pointing to the guilty
>> +	 * instruction. So, reloading the value of bus_lock_counter to '0'
>> +	 * results into generating continuous bus lock exits.
>> +	 */
>> +	svm->vmcb->control.bus_lock_counter = 1;
>> +
>> +	return 0;
>> +}
>> +
>>  static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>>  	[SVM_EXIT_READ_CR0]			= cr_interception,
>>  	[SVM_EXIT_READ_CR3]			= cr_interception,
>> @@ -3350,6 +3373,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>>  	[SVM_EXIT_CR4_WRITE_TRAP]		= cr_trap,
>>  	[SVM_EXIT_CR8_WRITE_TRAP]		= cr_trap,
>>  	[SVM_EXIT_INVPCID]                      = invpcid_interception,
>> +	[SVM_EXIT_BUS_LOCK]			= bus_lock_exit,
>>  	[SVM_EXIT_NPF]				= npf_interception,
>>  	[SVM_EXIT_RSM]                          = rsm_interception,
>>  	[SVM_EXIT_AVIC_INCOMPLETE_IPI]		= avic_incomplete_ipi_interception,
>> @@ -5212,6 +5236,11 @@ static __init void svm_set_cpu_caps(void)
>>  		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>>  	}
>>  
>> +	if (cpu_feature_enabled(X86_VIRT_FEATURE_BUS_LOCK_THRESHOLD)) {
>> +		pr_info("Bus Lock Threashold supported\n");
>> +		kvm_caps.has_bus_lock_exit = true;
>> +	}
>> +
>>  	/* CPUID 0x80000008 */
>>  	if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
>>  	    boot_cpu_has(X86_FEATURE_AMD_SSBD))

-Manali




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux