So the tdh_vp_flush should always succeed while vm is being torn down. Thanks Isaku for the explanation, and I think it would be great to add the error message. -Erdem On Wed, Mar 23, 2022 at 12:08 PM Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote: > > On Tue, Mar 22, 2022 at 05:54:45PM -0700, > Erdem Aktas <erdemaktas@xxxxxxxxxx> wrote: > > > On Fri, Mar 4, 2022 at 11:50 AM <isaku.yamahata@xxxxxxxxx> wrote: > > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c > > > index a6b1a8ce888d..690298fb99c7 100644 > > > --- a/arch/x86/kvm/vmx/tdx.c > > > +++ b/arch/x86/kvm/vmx/tdx.c > > > @@ -48,6 +48,14 @@ struct tdx_capabilities tdx_caps; > > > static DEFINE_MUTEX(tdx_lock); > > > static struct mutex *tdx_mng_key_config_lock; > > > > > > +/* > > > + * A per-CPU list of TD vCPUs associated with a given CPU. Used when a CPU > > > + * is brought down to invoke TDH_VP_FLUSH on the approapriate TD vCPUS. > > > + * Protected by interrupt mask. This list is manipulated in process context > > > + * of vcpu and IPI callback. See tdx_flush_vp_on_cpu(). > > > + */ > > > +static DEFINE_PER_CPU(struct list_head, associated_tdvcpus); > > > + > > > static u64 hkid_mask __ro_after_init; > > > static u8 hkid_start_pos __ro_after_init; > > > > > > @@ -87,6 +95,8 @@ static inline bool is_td_finalized(struct kvm_tdx *kvm_tdx) > > > > > > static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu) > > > { > > > + list_del(&to_tdx(vcpu)->cpu_list); > > > + > > > /* > > > * Ensure tdx->cpu_list is updated is before setting vcpu->cpu to -1, > > > * otherwise, a different CPU can see vcpu->cpu = -1 and add the vCPU > > > @@ -97,6 +107,22 @@ static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu) > > > vcpu->cpu = -1; > > > } > > > > > > +void tdx_hardware_enable(void) > > > +{ > > > + INIT_LIST_HEAD(&per_cpu(associated_tdvcpus, raw_smp_processor_id())); > > > +} > > > + > > > +void tdx_hardware_disable(void) > > > +{ > > > + int cpu = raw_smp_processor_id(); > > > + struct list_head *tdvcpus = &per_cpu(associated_tdvcpus, cpu); > > > + struct vcpu_tdx *tdx, *tmp; > > > + > > > + /* Safe variant needed as tdx_disassociate_vp() deletes the entry. */ > > > + list_for_each_entry_safe(tdx, tmp, tdvcpus, cpu_list) > > > + tdx_disassociate_vp(&tdx->vcpu); > > > +} > > > + > > > static void tdx_clear_page(unsigned long page) > > > { > > > const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0))); > > > @@ -230,9 +256,11 @@ void tdx_mmu_prezap(struct kvm *kvm) > > > struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); > > > cpumask_var_t packages; > > > bool cpumask_allocated; > > > + struct kvm_vcpu *vcpu; > > > u64 err; > > > int ret; > > > int i; > > > + unsigned long j; > > > > > > if (!is_hkid_assigned(kvm_tdx)) > > > return; > > > @@ -248,6 +276,17 @@ void tdx_mmu_prezap(struct kvm *kvm) > > > return; > > > } > > > > > > + kvm_for_each_vcpu(j, vcpu, kvm) > > > + tdx_flush_vp_on_cpu(vcpu); > > > + > > > + mutex_lock(&tdx_lock); > > > + err = tdh_mng_vpflushdone(kvm_tdx->tdr.pa); > > > > Hi Isaku, > > Hi. > > > > I am wondering about the impact of the failures on these functions. Is > > there any other function which recovers any failures here? > > When I look at the tdx_flush_vp function, it seems like it can fail > > due to task migration so tdx_flush_vp_on_cpu might also fail and if it > > fails, tdh_mng_vpflushdone returns err. Since tdx_vm_teardown does not > > return any error , how the VMM can free the keyid used in this TD. > > Will they be forever in "used state"? > > Also if tdx_vm_teardown fails, the kvm_tdx->hkid is never set to -1 > > which will prevent tdx_vcpu_free to free and reclaim the resources > > allocated for the vcpu. > > mmu_prezap() is called via release callback of mmu notifier when the last mmu > reference of this process is dropped. It is after all kvm vcpu fd and kvm vm > fd were closed. vcpu will never run. But we still hold kvm_vcpu structures. > There is no race between tdh_vp_flush()/tdh_mng_vpflushdone() here and process > migration. tdh_vp_flush()/tdh_mng_vp_flushdone() should success. > > The cpuid check in tdx_flush_vp() is for vcpu_load() which may race with process > migration. > > Anyway what if one of those TDX seamcalls fails? HKID is leaked and will be > never used because there is no good way to free and use HKID safely. Such > failure is due to unknown issue and probably a bug. > > One mitigation is to add pr_err() when HKID leak happens. I'll add such message > on next respin. > > thanks, > -- > Isaku Yamahata <isaku.yamahata@xxxxxxxxx>