> On Jan 12, 2021, at 7:17 AM, Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote: > > On Tue, 2021-01-12 at 07:11 -0800, Andy Lutomirski wrote: >>>> On Jan 12, 2021, at 4:15 AM, Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote: >>> >>> Wei Huang <wei.huang2@xxxxxxx> writes: >>> >>>> From: Bandan Das <bsd@xxxxxxxxxx> >>>> >>>> While running VM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD >>>> CPUs check EAX against reserved memory regions (e.g. SMM memory on host) >>>> before checking VMCB's instruction intercept. If EAX falls into such >>>> memory areas, #GP is triggered before VMEXIT. This causes problem under >>>> nested virtualization. To solve this problem, KVM needs to trap #GP and >>>> check the instructions triggering #GP. For VM execution instructions, >>>> KVM emulates these instructions; otherwise it re-injects #GP back to >>>> guest VMs. >>>> >>>> Signed-off-by: Bandan Das <bsd@xxxxxxxxxx> >>>> Co-developed-by: Wei Huang <wei.huang2@xxxxxxx> >>>> Signed-off-by: Wei Huang <wei.huang2@xxxxxxx> >>>> --- >>>> arch/x86/include/asm/kvm_host.h | 8 +- >>>> arch/x86/kvm/mmu.h | 1 + >>>> arch/x86/kvm/mmu/mmu.c | 7 ++ >>>> arch/x86/kvm/svm/svm.c | 157 +++++++++++++++++++------------- >>>> arch/x86/kvm/svm/svm.h | 8 ++ >>>> arch/x86/kvm/vmx/vmx.c | 2 +- >>>> arch/x86/kvm/x86.c | 37 +++++++- >>>> 7 files changed, 146 insertions(+), 74 deletions(-) >>>> >>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h >>>> index 3d6616f6f6ef..0ddc309f5a14 100644 >>>> --- a/arch/x86/include/asm/kvm_host.h >>>> +++ b/arch/x86/include/asm/kvm_host.h >>>> @@ -1450,10 +1450,12 @@ extern u64 kvm_mce_cap_supported; >>>> * due to an intercepted #UD (see EMULTYPE_TRAP_UD). >>>> * Used to test the full emulator from userspace. >>>> * >>>> - * EMULTYPE_VMWARE_GP - Set when emulating an intercepted #GP for VMware >>>> + * EMULTYPE_PARAVIRT_GP - Set when emulating an intercepted #GP for VMware >>>> * backdoor emulation, which is opt in via module param. >>>> * VMware backoor emulation handles select instructions >>>> - * and reinjects the #GP for all other cases. >>>> + * and reinjects #GP for all other cases. This also >>>> + * handles other cases where #GP condition needs to be >>>> + * handled and emulated appropriately >>>> * >>>> * EMULTYPE_PF - Set when emulating MMIO by way of an intercepted #PF, in which >>>> * case the CR2/GPA value pass on the stack is valid. >>>> @@ -1463,7 +1465,7 @@ extern u64 kvm_mce_cap_supported; >>>> #define EMULTYPE_SKIP (1 << 2) >>>> #define EMULTYPE_ALLOW_RETRY_PF (1 << 3) >>>> #define EMULTYPE_TRAP_UD_FORCED (1 << 4) >>>> -#define EMULTYPE_VMWARE_GP (1 << 5) >>>> +#define EMULTYPE_PARAVIRT_GP (1 << 5) >>>> #define EMULTYPE_PF (1 << 6) >>>> >>>> int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type); >>>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h >>>> index 581925e476d6..1a2fff4e7140 100644 >>>> --- a/arch/x86/kvm/mmu.h >>>> +++ b/arch/x86/kvm/mmu.h >>>> @@ -219,5 +219,6 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); >>>> >>>> int kvm_mmu_post_init_vm(struct kvm *kvm); >>>> void kvm_mmu_pre_destroy_vm(struct kvm *kvm); >>>> +bool kvm_is_host_reserved_region(u64 gpa); >>> >>> Just a suggestion: "kvm_gpa_in_host_reserved()" maybe? >>> >>>> #endif >>>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c >>>> index 6d16481aa29d..c5c4aaf01a1a 100644 >>>> --- a/arch/x86/kvm/mmu/mmu.c >>>> +++ b/arch/x86/kvm/mmu/mmu.c >>>> @@ -50,6 +50,7 @@ >>>> #include <asm/io.h> >>>> #include <asm/vmx.h> >>>> #include <asm/kvm_page_track.h> >>>> +#include <asm/e820/api.h> >>>> #include "trace.h" >>>> >>>> extern bool itlb_multihit_kvm_mitigation; >>>> @@ -5675,6 +5676,12 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm, >>>> } >>>> EXPORT_SYMBOL_GPL(kvm_mmu_slot_set_dirty); >>>> >>>> +bool kvm_is_host_reserved_region(u64 gpa) >>>> +{ >>>> + return e820__mbapped_raw_any(gpa-1, gpa+1, E820_TYPE_RESERVED); >>>> +} >>> >>> While _e820__mapped_any()'s doc says '.. checks if any part of the >>> range <start,end> is mapped ..' it seems to me that the real check is >>> [start, end) so we should use 'gpa' instead of 'gpa-1', no? >> >> Why do you need to check GPA at all? >> > To reduce the scope of the workaround. > > The errata only happens when you use one of SVM instructions > in the guest with EAX that happens to be inside one > of the host reserved memory regions (for example SMM). This code reduces the scope of the workaround at the cost of increasing the complexity of the workaround and adding a nonsensical coupling between KVM and host details and adding an export that really doesn’t deserve to be exported. Is there an actual concrete benefit to this check?