On Sun, Jan 5, 2025 at 5:46 PM Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> wrote: > > > > > On 1/4/2025 5:59 AM, Vishal Annapurve wrote: > > On Sun, Dec 8, 2024 at 5:11 PM Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> wrote: > >> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > >> > >> Inhibit APICv for TDX guest in KVM since TDX doesn't support APICv accesses > >> from host VMM. > >> > >> Follow how SEV inhibits APICv. I.e, define a new inhibit reason for TDX, set > >> it on TD initialization, and add the flag to kvm_x86_ops.required_apicv_inhibits. > >> > >> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > >> Signed-off-by: Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> > >> --- > >> TDX interrupts breakout: > >> - Removed WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm)) in > >> tdx_td_vcpu_init(). (Rick) > >> - Change APICV -> APICv in changelog for consistency. > >> - Split the changelog to 2 paragraphs. > >> --- > >> arch/x86/include/asm/kvm_host.h | 12 +++++++++++- > >> arch/x86/kvm/vmx/main.c | 3 ++- > >> arch/x86/kvm/vmx/tdx.c | 3 +++ > >> 3 files changed, 16 insertions(+), 2 deletions(-) > >> > >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > >> index 32c7d58a5d68..df535f08e004 100644 > >> --- a/arch/x86/include/asm/kvm_host.h > >> +++ b/arch/x86/include/asm/kvm_host.h > >> @@ -1281,6 +1281,15 @@ enum kvm_apicv_inhibit { > >> */ > >> APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED, > >> > >> + /*********************************************************/ > >> + /* INHIBITs that are relevant only to the Intel's APICv. */ > >> + /*********************************************************/ > >> + > >> + /* > >> + * APICv is disabled because TDX doesn't support it. > >> + */ > >> + APICV_INHIBIT_REASON_TDX, > >> + > >> NR_APICV_INHIBIT_REASONS, > >> }; > >> > >> @@ -1299,7 +1308,8 @@ enum kvm_apicv_inhibit { > >> __APICV_INHIBIT_REASON(IRQWIN), \ > >> __APICV_INHIBIT_REASON(PIT_REINJ), \ > >> __APICV_INHIBIT_REASON(SEV), \ > >> - __APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED) > >> + __APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED), \ > >> + __APICV_INHIBIT_REASON(TDX) > >> > >> struct kvm_arch { > >> unsigned long n_used_mmu_pages; > >> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c > >> index 7f933f821188..13a0ab0a520c 100644 > >> --- a/arch/x86/kvm/vmx/main.c > >> +++ b/arch/x86/kvm/vmx/main.c > >> @@ -445,7 +445,8 @@ static int vt_gmem_private_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn) > >> BIT(APICV_INHIBIT_REASON_BLOCKIRQ) | \ > >> BIT(APICV_INHIBIT_REASON_PHYSICAL_ID_ALIASED) | \ > >> BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) | \ > >> - BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED)) > >> + BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED) | \ > >> + BIT(APICV_INHIBIT_REASON_TDX)) > >> > >> struct kvm_x86_ops vt_x86_ops __initdata = { > >> .name = KBUILD_MODNAME, > >> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c > >> index b0f525069ebd..b51d2416acfb 100644 > >> --- a/arch/x86/kvm/vmx/tdx.c > >> +++ b/arch/x86/kvm/vmx/tdx.c > >> @@ -2143,6 +2143,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, > >> goto teardown; > >> } > >> > >> + kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_TDX); > >> + > >> return 0; > >> > >> /* > >> @@ -2528,6 +2530,7 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx) > >> return -EIO; > >> } > >> > >> + vcpu->arch.apic->apicv_active = false; > > With this setting, apic_timer_expired[1] will always cause timer > > interrupts to be pending without injecting them right away. Injecting > > it after VM exit [2] could cause unbounded delays to timer interrupt > > injection. > > When apic->apicv_active is false, it will fallback to increasing the > apic->lapic_timer.pending and request KVM_REQ_UNBLOCK. > If apic_timer_expired() is called from timer function, the target vCPU > will be kicked out immediately. > So there is no unbounded delay to timer interrupt injection. Ack. Though, wouldn't it be faster to just post timer interrupts right away without causing vcpu exit? Another scenario I was thinking of was hrtimer expiry during vcpu exit being handled in KVM/userspace, which will cause timer interrupt injection after the next exit [1] delaying timer interrupt to guest. This scenario is not specific to TDX VMs though. [1] https://git.kernel.org/pub/scm/virt/kvm/kvm.git/tree/arch/x86/kvm/x86.c?h=kvm-coco-queue#n11263 > > In a previous PUCK session, Sean suggested apic->apicv_active should be > set to true to align with the hardware setting because TDX module always > enables apicv for TDX guests. > Will send a write up about changing apicv to active later. > > > > > [1] https://git.kernel.org/pub/scm/virt/kvm/kvm.git/tree/arch/x86/kvm/lapic.c?h=kvm-coco-queue#n1922 > > [2] https://git.kernel.org/pub/scm/virt/kvm/kvm.git/tree/arch/x86/kvm/x86.c?h=kvm-coco-queue#n11263 > > > >> vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; > >> > >> return 0; > >> -- > >> 2.46.0 > >> > >> >