On Tue, Feb 28, 2023 at 11:52:59AM +0000, "Huang, Kai" <kai.huang@xxxxxxxxx> wrote: > On Tue, 2023-02-28 at 03:06 -0800, Isaku Yamahata wrote: > > > > + if (!e) > > > > + return -ENOMEM; > > > > + *e = (struct kvm_cpuid_entry2) { > > > > + .function = 1, /* Features for X2APIC */ > > > > + .index = 0, > > > > + .eax = 0, > > > > + .ebx = 0, > > > > + .ecx = 1ULL << 21, /* X2APIC */ > > > > + .edx = 0, > > > > + }; > > > > + vcpu->arch.cpuid_entries = e; > > > > + vcpu->arch.cpuid_nent = 1; > > > > > > As mentioned above, why doing it here? Won't be this be overwritten later in > > > KVM_SET_CPUID2? > > > > Yes, user space VMM can overwrite cpuid[0x1] and APIC base MSR. But it > > doesn't > > matter because it's a bug of user space VMM. user space VMM has to keep the > > consistency of cpuid and MSRs. > > Because TDX module virtualizes cpuid[0x1].x2apic to fixed 1, KVM value doesn't > > matter after vcpu creation. > > Because KVM virtualizes APIC base as read only to guest, cpuid[0x1].x2apic > > doesn't matter after vcpu creation as long as user space VMM keeps KVM APIC > > BASE > > value. > > > > Contrary, can we depend on userspace VMM to set x2APIC in CPUID, but not do this > in KVM? If userspace doesn't do it, we treat it as userspace's bug. > > Plus, userspace anyway needs to set x2APIC in CPUID regardless whether you have > done above here, correct? > > I don't see the point of doing above in KVM because you are neither enforcing > anything in KVM, nor you are reducing effort of userspace. Good idea. I can drop cpuid part from tdx_vcpu_create() and apic base part from tdx_vcpu_reset(). It needs to modify tdx_has_emulated_msr() to allow user space VMM to update APIC BASE MSR. diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index e045a8132639..46c82ce3ef46 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -2624,7 +2624,14 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { - if (tdx_has_emulated_msr(msr->index, true)) + if (tdx_has_emulated_msr(msr->index, true) || + /* + * user space VMM should explicitly set to X2APIC mode as initial + * value that is deviated from the conventional case. + */ + (msr->host_initiated && msr->index == MSR_IA32_APICBASE && + (msr->data & ~MSR_IA32_APICBASE_BSP) == + (APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC))) return kvm_set_msr_common(vcpu, msr); return 1; } Just FYI, qemu needs the following change. --- a/target/i386/kvm/tdx.c +++ b/target/i386/kvm/tdx.c @@ -274,6 +274,11 @@ bool is_tdx_vm(void) return !!tdx_guest; } +bool is_tdx_vm_finalized(void) +{ + return !!tdx_guest && tdx_guest->parent_obj.ready; +} + static inline uint32_t host_cpuid_reg(uint32_t function, uint32_t index, int reg) { @@ -875,10 +880,20 @@ static void tdx_post_init_vcpus(void) TdxFirmwareEntry *hob; CPUState *cpu; int r; + uint64_t apic_base; hob = tdx_get_hob_entry(tdx_guest); CPU_FOREACH(cpu) { - apic_force_x2apic(X86_CPU(cpu)->apic_state); + X86CPU *x86_cpu = X86_CPU(cpu); + + apic_force_x2apic(x86_cpu->apic_state); + + apic_base = APIC_DEFAULT_ADDRESS | MSR_IA32_APICBASE_ENABLE | + MSR_IA32_APICBASE_EXTD; + if (cpu_is_bsp(x86_cpu)) + apic_base |= MSR_IA32_APICBASE_BSP; + cpu_set_apic_base(x86_cpu->apic_state, apic_base); + kvm_put_apicbase(x86_cpu, apic_base); r = tdx_vcpu_ioctl(cpu, KVM_TDX_INIT_VCPU, 0, (void *)hob->address); if (r < 0) { diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h index 5cc0f730afa6..f77689464738 100644 --- a/target/i386/kvm/tdx.h +++ b/target/i386/kvm/tdx.h @@ -59,8 +59,10 @@ typedef struct TdxGuest { #ifdef CONFIG_TDX bool is_tdx_vm(void); +bool is_tdx_vm_finalized(void); #else #define is_tdx_vm() 0 +#define is_tdx_vm_finalized() false; #endif /* CONFIG_TDX */ int tdx_kvm_init(MachineState *ms, Error **errp); -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>