On Mon, Jun 17, 2024 at 05:07:56PM +0800, Binbin Wu wrote:
>
>
> On 6/17/2024 4:07 PM, Yuan Yao wrote:
> > On Mon, Feb 26, 2024 at 12:26:27AM -0800, isaku.yamahata@xxxxxxxxx wrote:
> > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > >
> > > This corresponds to VMX __vmx_complete_interrupts().  Because TDX
> > > virtualize vAPIC, KVM only needs to care NMI injection.
> > >
> > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > > Reviewed-by: Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx>
> > > ---
> > > v19:
> > > - move tdvps_management_check() to this patch
> > > - typo: complete -> Complete in short log
> > > ---
> > >  arch/x86/kvm/vmx/tdx.c | 10 ++++++++++
> > >  arch/x86/kvm/vmx/tdx.h |  4 ++++
> > >  2 files changed, 14 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > > index 83dcaf5b6fbd..b8b168f74dfe 100644
> > > --- a/arch/x86/kvm/vmx/tdx.c
> > > +++ b/arch/x86/kvm/vmx/tdx.c
> > > @@ -535,6 +535,14 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> > >  	 */
> > >  }
> > >
> > > +static void tdx_complete_interrupts(struct kvm_vcpu *vcpu)
> > > +{
> > > +	/* Avoid costly SEAMCALL if no nmi was injected */
> > > +	if (vcpu->arch.nmi_injected)
> > > +		vcpu->arch.nmi_injected = td_management_read8(to_tdx(vcpu),
> > > +						      TD_VCPU_PEND_NMI);
> > > +}
> > Looks this leads to NMI injection delay or even won't be
> > reinjected if KVM_REQ_EVENT is not set on the target cpu
> > when more than 1 NMIs are pending there.
> >
> > On normal VM, KVM uses NMI window vmexit for injection
> > successful case to rasie the KVM_REQ_EVENT again for remain
> > pending NMIs, see handle_nmi_window(). KVM also checks
> > vectoring info after VMEXIT for case that the NMI is not
> > injected successfully in this vmentry vmexit round, and
> > raise KVM_REQ_EVENT to try again, see __vmx_complete_interrupts().
> >
> > In TDX, consider there's no way to get vectoring info or
> > handle nmi window vmexit, below checking should cover both
> > scenarios for NMI injection:
> >
> > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > index e9c9a185bb7b..9edf446acd3b 100644
> > --- a/arch/x86/kvm/vmx/tdx.c
> > +++ b/arch/x86/kvm/vmx/tdx.c
> > @@ -835,9 +835,12 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> >  static void tdx_complete_interrupts(struct kvm_vcpu *vcpu)
> >  {
> >  	/* Avoid costly SEAMCALL if no nmi was injected */
> > -	if (vcpu->arch.nmi_injected)
> > +	if (vcpu->arch.nmi_injected) {
> >  		vcpu->arch.nmi_injected = td_management_read8(to_tdx(vcpu),
> >  						TD_VCPU_PEND_NMI);
> > +		if (vcpu->arch.nmi_injected || vcpu->arch.nmi_pending)
> > +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> For nmi_injected, it should be OK because TD_VCPU_PEND_NMI is still set.
> But for nmi_pending, it should be checked and raise event.

Right, I just forgot the TDX module can do more than "hardware":

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e9c9a185bb7b..3530a4882efc 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -835,9 +835,16 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 static void tdx_complete_interrupts(struct kvm_vcpu *vcpu)
 {
 	/* Avoid costly SEAMCALL if no nmi was injected */
-	if (vcpu->arch.nmi_injected)
+	if (vcpu->arch.nmi_injected) {
 		vcpu->arch.nmi_injected = td_management_read8(to_tdx(vcpu),
 						TD_VCPU_PEND_NMI);
+		/*
+		 * The TDX module retries the injection while TD_VCPU_PEND_NMI
+		 * is set, so no need to set KVM_REQ_EVENT for it again.
+		 */
+		if (!vcpu->arch.nmi_injected && vcpu->arch.nmi_pending)
+			kvm_make_request(KVM_REQ_EVENT, vcpu);
+	}
 }

>
> I remember there was a discussion in the following link:
> https://lore.kernel.org/kvm/20240402065254.GY2444378@xxxxxxxxxxxxxxxxxxxxx/
> It said tdx_vcpu_run() will ignore force_immediate_exit.
> If force_immediate_exit is igored for TDX, then the nmi_pending handling
> could still be delayed if the previous NMI was injected successfully.

Yes, not sure how likely this is to happen in a real use case, but I know
it happens in some testing, e.g. the kvm unit test's multiple NMI testing.

> >
> > +	}
> >  }
> >
> > > +
> > >  struct tdx_uret_msr {
> > >  	u32 msr;
> > >  	unsigned int slot;
> > > @@ -663,6 +671,8 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
> > >  	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
> > >  	trace_kvm_exit(vcpu, KVM_ISA_VMX);
> > >
> > > +	tdx_complete_interrupts(vcpu);
> > > +
> > >  	return EXIT_FASTPATH_NONE;
> > >  }
> > >
> > > diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> > > index 44eab734e702..0d8a98feb58e 100644
> > > --- a/arch/x86/kvm/vmx/tdx.h
> > > +++ b/arch/x86/kvm/vmx/tdx.h
> > > @@ -142,6 +142,8 @@ static __always_inline void tdvps_vmcs_check(u32 field, u8 bits)
> > >  		       "Invalid TD VMCS access for 16-bit field");
> > >  }
> > >
> > > +static __always_inline void tdvps_management_check(u64 field, u8 bits) {}
> > > +
> > >  #define TDX_BUILD_TDVPS_ACCESSORS(bits, uclass, lclass)			\
> > >  static __always_inline u##bits td_##lclass##_read##bits(struct vcpu_tdx *tdx,	\
> > >  							 u32 field)		\
> > > @@ -200,6 +202,8 @@ TDX_BUILD_TDVPS_ACCESSORS(16, VMCS, vmcs);
> > >  TDX_BUILD_TDVPS_ACCESSORS(32, VMCS, vmcs);
> > >  TDX_BUILD_TDVPS_ACCESSORS(64, VMCS, vmcs);
> > >
> > > +TDX_BUILD_TDVPS_ACCESSORS(8, MANAGEMENT, management);
> > > +
> > >  static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 field)
> > >  {
> > >  	struct tdx_module_args out;
> > > --
> > > 2.25.1
> > >
> >
>
>
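
For reference, this is only a sketch of how the complete
tdx_complete_interrupts() would read with the change proposed above applied.
It is not a standalone snippet: td_management_read8() (built by
TDX_BUILD_TDVPS_ACCESSORS(8, MANAGEMENT, management)), the TD_VCPU_PEND_NMI
field and to_tdx() all come from the patches quoted in this thread.

static void tdx_complete_interrupts(struct kvm_vcpu *vcpu)
{
	/* Avoid costly SEAMCALL if no NMI was injected before TD entry */
	if (vcpu->arch.nmi_injected) {
		/*
		 * TD_VCPU_PEND_NMI still set means the TDX module has not
		 * delivered the NMI yet and will keep retrying on later TD
		 * entries, so keep nmi_injected and don't raise
		 * KVM_REQ_EVENT for it.
		 */
		vcpu->arch.nmi_injected = td_management_read8(to_tdx(vcpu),
							      TD_VCPU_PEND_NMI);
		/*
		 * Injection completed: if more NMIs are queued, ask KVM to
		 * re-evaluate pending events so the next NMI gets injected
		 * via TD_VCPU_PEND_NMI on a later entry.
		 */
		if (!vcpu->arch.nmi_injected && vcpu->arch.nmi_pending)
			kvm_make_request(KVM_REQ_EVENT, vcpu);
	}
}

Compared to the earlier proposal, which also raised KVM_REQ_EVENT when
nmi_injected stayed set, this version relies on the TDX module's own retry
via TD_VCPU_PEND_NMI and only raises the request once the previous NMI has
actually been delivered and more NMIs are still pending.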