Re: [PATCH v3 2/5] KVM: x86: invert KVM_HYPERCALL to default to VMMCALL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Sean,

> On Aug 20, 2021, at 2:15 AM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> 
> Preferred shortlog prefix for KVM guest changes is "x86/kvm".  "KVM: x86" is for
> host changes.
> 
>> On Tue, Jun 08, 2021, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@xxxxxxx>
>> 
>> KVM hypercall framework relies on alternative framework to patch the
>> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
>> apply_alternative() is called then it defaults to VMCALL. The approach
>> works fine on non SEV guest. A VMCALL would causes #UD, and hypervisor
>> will be able to decode the instruction and do the right things. But
>> when SEV is active, guest memory is encrypted with guest key and
>> hypervisor will not be able to decode the instruction bytes.
>> 
>> So invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to VMMCALL
>> and opt into VMCALL.
> 
> The changelog needs to explain why SEV hypercalls need to be made before
> apply_alternative(), why it's ok to make Intel CPUs take #UDs on the unknown
> VMMCALL, and why this is not creating the same conundrum for TDX.

I think it makes more sense to stick to the original approach/patch, i.e., introducing a new private hypercall interface like kvm_sev_hypercall3() and let early paravirtualized kernel code invoke this private hypercall interface wherever required.

This helps avoiding Intel CPUs taking unnecessary #UDs and also avoid using hacks as below.

TDX code can introduce similar private hypercall interface for their early para virtualized kernel code if required.

> 
> Actually, I don't think making Intel CPUs take #UDs is acceptable.  This patch
> breaks Linux on upstream KVM on Intel due a bug in upstream KVM.  KVM attempts
> to patch the "wrong" hypercall to the "right" hypercall, but stupidly does so
> via an emulated write.  I.e. KVM honors the guest page table permissions and
> injects a !WRITABLE #PF on the VMMCALL RIP if the kernel code is mapped RX.
> 
> In other words, trusting the VMM to not screw up the #UD is a bad idea.  This also
> makes documenting the "why does SEV need super early hypercalls" extra important.
> 

Makes sense.

Thanks,
Ashish

> This patch doesn't work because X86_FEATURE_VMCALL is a synthetic flag and is
> only set by VMware paravirt code, which is why the patching doesn't happen as
> would be expected.  The obvious solution would be to manually set X86_FEATURE_VMCALL
> where appropriate, but given that defaulting to VMCALL has worked for years,
> defaulting to VMMCALL makes me nervous, e.g. even if we splatter X86_FEATURE_VMCALL
> into Intel, Centaur, and Zhaoxin, there's a possibility we'll break existing VMs
> that run on hypervisors that do something weird with the vendor string.
> 
> Rather than look for X86_FEATURE_VMCALL, I think it makes sense to have this be
> a "pure" inversion, i.e. patch in VMCALL if VMMCALL is not supported, as opposed
> to patching in VMCALL if VMCALL is supproted.
> 
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 69299878b200..61641e69cfda 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
> #endif /* CONFIG_KVM_GUEST */
> 
> #define KVM_HYPERCALL \
> -        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
> +        ALTERNATIVE("vmmcall", "vmcall", ALT_NOT(X86_FEATURE_VMMCALL))
> 
> /* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
>  * instruction.  The hypervisor may replace it with something else but only the
> 
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Joerg Roedel <joro@xxxxxxxxxx>
>> Cc: Borislav Petkov <bp@xxxxxxx>
>> Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
>> Cc: x86@xxxxxxxxxx
>> Cc: kvm@xxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
> 
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> 
>> Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> 
> Is Brijesh the author?  Co-developed-by for a one-line change would be odd...
> 
>> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
>> ---
>> arch/x86/include/asm/kvm_para.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
>> index 69299878b200..0267bebb0b0f 100644
>> --- a/arch/x86/include/asm/kvm_para.h
>> +++ b/arch/x86/include/asm/kvm_para.h
>> @@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
>> #endif /* CONFIG_KVM_GUEST */
>> 
>> #define KVM_HYPERCALL \
>> -        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
>> +    ALTERNATIVE("vmmcall", "vmcall", X86_FEATURE_VMCALL)
>> 
>> /* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
>>  * instruction.  The hypervisor may replace it with something else but only the
>> -- 
>> 2.17.1
>> 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux