On Tue, 2023-05-30 at 13:34 -0700, Jim Mattson wrote:
> On Tue, May 30, 2023 at 1:10 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> > On Mon, May 29, 2023 at 6:44 AM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
> > > On Mon, 2023-05-29 at 14:58 +0200, jwarren@xxxxxxxxxxxx wrote:
> > > > Hello,
> > > > Since kernel 5.16 users can't start VMware VMs when VMware is nested under KVM on AMD CPUs.
> > > >
> > > > User reports are here:
> > > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008583
> > > > https://forums.unraid.net/topic/128868-vmware-7x-will-not-start-any-vms-under-unraid-6110/
> > > >
> > > > I've pinpointed it to commit 174a921b6975ef959dd82ee9e8844067a62e3ec1 (appeared in 5.16-rc1)
> > > > "nSVM: Check for reserved encodings of TLB_CONTROL in nested VMCB"
> > > >
> > > > I've confirmed that VMware errors out when it checks for TLB_CONTROL_FLUSH_ASID support and gets a 'false' answer.
> > > >
> > > > First revisions of the patch in question had some support for TLB_CONTROL_FLUSH_ASID, but it was removed:
> > > > https://lore.kernel.org/kvm/f7c2d5f5-3560-8666-90be-3605220cb93c@xxxxxxxxxx/
> > > >
> > > > I don't know what would be the best approach here, maybe put a quirk there, so it doesn't break "userspace".
> > > > Committer's email is dead, so I'm writing here.
> > > >
> > >
> > > I have to say that I have known about this for a long time, because some time ago I used to play with VMware Player in a
> > > VM on AMD in my spare time, on weekends
> > > (just doing various crazy things with double nesting, running win98 nested, vfio stuff, etc, etc).
> > >
> > > I didn't report it because it's a bug in VMware - they set a bit in the tlb_control without checking CPUID's FLUSHBYASID
> > > which states that KVM doesn't support setting this bit.
> >
> > I am pretty sure that bit 1 is supposed to be ignored on hardware
> > without FlushByASID, but I'll have to see if I can dig up an old APM
> > to verify that.
>
> I couldn't find an APM that old, but even today's APM does not specify
> that any checks are performed on the TLB_CONTROL field by VMRUN.
>
> While Intel likes to fail VM-entry for illegal VMCS state, AMD prefers
> to massage the VMCB to render any illegal VMCB state legal. For
> example, rather than fail VM-entry for a non-canonical address, AMD is
> inclined to drop the high bits and sign-extend the low bits, so that
> the address is canonical.
>
> I'm willing to bet that modern CPUs continue to ignore the TLB_CONTROL
> bits that were noted "reserved" in version 3.22 of the manual, and
> that Krish simply manufactured the checks in commit 174a921b6975
> ("nSVM: Check for reserved encodings of TLB_CONTROL in nested VMCB"),
> without cause.
>
> > > Supporting FLUSHBYASID would fix this, and make nesting faster too,
> > > but it is far from a trivial job.
> > >
> > > I hope that I will find time to do this soon.
> > >
> > > Best regards,
> > >     Maxim Levitsky
> > >
> > >

Yup... After applying this horrible hack to KVM, the VM still boots just fine on bare metal.

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index e7c7379d6ac7b0..2e45c1b747104a 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -170,10 +170,10 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 };
 
-#define TLB_CONTROL_DO_NOTHING 0
-#define TLB_CONTROL_FLUSH_ALL_ASID 1
-#define TLB_CONTROL_FLUSH_ASID 3
-#define TLB_CONTROL_FLUSH_ASID_LOCAL 7
+#define TLB_CONTROL_DO_NOTHING 0xF0
+#define TLB_CONTROL_FLUSH_ALL_ASID 0xF1
+#define TLB_CONTROL_FLUSH_ASID 0xF3
+#define TLB_CONTROL_FLUSH_ASID_LOCAL 0xF7
 
 #define V_TPR_MASK 0x0f

Shall we revert the offending patch then?

Best regards,
    Maxim Levitsky
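
For readers following the thread, the two behaviours being contrasted could be sketched roughly as below. This is a standalone user-space illustration, not the KVM code: the encodings are the original values from svm.h shown in the diff, while the helper names tlb_ctl_valid_strict() and tlb_ctl_sanitize() are hypothetical and exist only for this example. The strict helper models a hypervisor that fails nested VMRUN for any encoding it does not implement; the lenient helper models one that never fails and falls back to a full flush, on the assumption that over-flushing is slower but functionally safe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original encodings from arch/x86/include/asm/svm.h (see the diff above). */
#define TLB_CONTROL_DO_NOTHING        0
#define TLB_CONTROL_FLUSH_ALL_ASID    1
#define TLB_CONTROL_FLUSH_ASID        3
#define TLB_CONTROL_FLUSH_ASID_LOCAL  7

/*
 * Hypothetical strict policy: accept only the encodings that a hypervisor
 * without nested FLUSHBYASID support implements, reject everything else.
 * A guest that sets TLB_CONTROL_FLUSH_ASID without checking CPUID fails here.
 */
static inline bool tlb_ctl_valid_strict(uint8_t tlb_ctl)
{
	switch (tlb_ctl) {
	case TLB_CONTROL_DO_NOTHING:
	case TLB_CONTROL_FLUSH_ALL_ASID:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical lenient policy: never reject. Anything other than
 * "do nothing" is treated as a request to flush all ASIDs, which
 * flushes at least as much as the guest asked for.
 */
static inline uint8_t tlb_ctl_sanitize(uint8_t tlb_ctl)
{
	if (tlb_ctl == TLB_CONTROL_DO_NOTHING)
		return TLB_CONTROL_DO_NOTHING;
	return TLB_CONTROL_FLUSH_ALL_ASID;
}
```

With the strict helper, TLB_CONTROL_FLUSH_ASID (3) is rejected, which matches the failure VMware users report. With the lenient helper, the same value is accepted and degraded to TLB_CONTROL_FLUSH_ALL_ASID. Note that this sketch does not model the hardware behaviour probed by the 0xF0 hack, where the upper bits appear to be ignored entirely.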