Hello Sean,

> On Aug 20, 2021, at 2:15 AM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> Preferred shortlog prefix for KVM guest changes is "x86/kvm". "KVM: x86" is for
> host changes.
>
>> On Tue, Jun 08, 2021, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@xxxxxxx>
>>
>> KVM hypercall framework relies on alternative framework to patch the
>> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
>> apply_alternative() is called then it defaults to VMCALL. The approach
>> works fine on non SEV guest. A VMCALL would cause #UD, and hypervisor
>> will be able to decode the instruction and do the right things. But
>> when SEV is active, guest memory is encrypted with guest key and
>> hypervisor will not be able to decode the instruction bytes.
>>
>> So invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to VMMCALL
>> and opt into VMCALL.
>
> The changelog needs to explain why SEV hypercalls need to be made before
> apply_alternative(), why it's ok to make Intel CPUs take #UDs on the unknown
> VMMCALL, and why this is not creating the same conundrum for TDX.

I think it makes more sense to stick to the original approach/patch, i.e.,
introducing a new private hypercall interface like kvm_sev_hypercall3() and
letting early paravirtualized kernel code invoke this private hypercall
interface wherever required.

This helps avoid Intel CPUs taking unnecessary #UDs and also avoids hacks
like the one below.

TDX code can introduce a similar private hypercall interface for its early
paravirtualized kernel code if required.

>
> Actually, I don't think making Intel CPUs take #UDs is acceptable. This patch
> breaks Linux on upstream KVM on Intel due to a bug in upstream KVM. KVM attempts
> to patch the "wrong" hypercall to the "right" hypercall, but stupidly does so
> via an emulated write. I.e. KVM honors the guest page table permissions and
> injects a !WRITABLE #PF on the VMMCALL RIP if the kernel code is mapped RX.
>
> In other words, trusting the VMM to not screw up the #UD is a bad idea. This also
> makes documenting the "why does SEV need super early hypercalls" extra important.
>

Makes sense.

Thanks,
Ashish

> This patch doesn't work because X86_FEATURE_VMCALL is a synthetic flag and is
> only set by VMware paravirt code, which is why the patching doesn't happen as
> would be expected. The obvious solution would be to manually set X86_FEATURE_VMCALL
> where appropriate, but given that defaulting to VMCALL has worked for years,
> defaulting to VMMCALL makes me nervous, e.g. even if we splatter X86_FEATURE_VMCALL
> into Intel, Centaur, and Zhaoxin, there's a possibility we'll break existing VMs
> that run on hypervisors that do something weird with the vendor string.
>
> Rather than look for X86_FEATURE_VMCALL, I think it makes sense to have this be
> a "pure" inversion, i.e. patch in VMCALL if VMMCALL is not supported, as opposed
> to patching in VMCALL if VMCALL is supported.
>
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 69299878b200..61641e69cfda 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
>  #endif /* CONFIG_KVM_GUEST */
>
>  #define KVM_HYPERCALL \
> -        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
> +        ALTERNATIVE("vmmcall", "vmcall", ALT_NOT(X86_FEATURE_VMMCALL))
>
>  /* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
>   * instruction. The hypervisor may replace it with something else but only the
>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Joerg Roedel <joro@xxxxxxxxxx>
>> Cc: Borislav Petkov <bp@xxxxxxx>
>> Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
>> Cc: x86@xxxxxxxxxx
>> Cc: kvm@xxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
>
>> Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
>
> Is Brijesh the author? Co-developed-by for a one-line change would be odd...
>
>> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
>> ---
>>  arch/x86/include/asm/kvm_para.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
>> index 69299878b200..0267bebb0b0f 100644
>> --- a/arch/x86/include/asm/kvm_para.h
>> +++ b/arch/x86/include/asm/kvm_para.h
>> @@ -17,7 +17,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
>>  #endif /* CONFIG_KVM_GUEST */
>>
>>  #define KVM_HYPERCALL \
>> -        ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
>> +        ALTERNATIVE("vmmcall", "vmcall", X86_FEATURE_VMCALL)
>>
>>  /* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
>>   * instruction. The hypervisor may replace it with something else but only the
>> --
>> 2.17.1
>>