On Mon, 2024-04-22 at 09:54 -0700, Sean Christopherson wrote: > On Mon, Apr 22, 2024, Kai Huang wrote: > > On Fri, 2024-04-19 at 10:23 -0700, Sean Christopherson wrote: > > > On Fri, Apr 19, 2024, Kai Huang wrote: > > > And tdx_enable() should also do its best to verify that the caller is post-VMXON: > > > > > > if (WARN_ON_ONCE(!(__read_cr4() & X86_CR4_VMXE))) > > > return -EINVAL; > > > > This won't be helpful, or at least isn't sufficient. > > > > tdx_enable() can SEAMCALLs on all online CPUs, so checking "the caller is > > post-VMXON" isn't enough. It needs "checking all online CPUs are in post- > > VMXON with tdx_cpu_enable() having been done". > > I'm suggesting adding it in the responding code that does that actual SEAMCALL. The thing is to check you will need to do additional things like making sure no scheduling would happen during "check + make SEAMCALL". Doesn't seem worth to do for me. The intent of tdx_enable() was the caller should make sure no new CPU should come online (well this can be relaxed if we move cpuhp_setup_state() to hardware_enable_all()), and all existing online CPUs are in post-VMXON with tdx_cpu_enable() been done. I think, if we ever need any check, the latter seems to be more reasonable. But if we allow new CPU to become online during tdx_enable() (with your enhancement to the hardware enabling), then I don't know how to make such check at the beginning of tdx_enable(), because do on_each_cpu(check_seamcall_precondition, NULL, 1); cannot catch any new CPU during tdx_enable(). > > And the intent isn't to catch every possible problem. As with many sanity checks, > the intent is to detect the most likely failure mode to make triaging and debugging > issues a bit easier. The SEAMCALL will literally return a unique error code to indicate CPU isn't in post-VMXON, or tdx_cpu_enable() hasn't been done. I think the error code is already clear to pinpoint the problem (due to these pre- SEAMCALL-condition not being met). > > > I didn't add such check because it's not mandatory, i.e., the later > > SEAMCALL will catch such violation. > > Yeah, a sanity check definitely isn't manadatory, but I do think it would be > useful and worthwhile. The code in question is relatively unique (enables VMX > at module load) and a rare operation, i.e. the cost of sanity checking CR4.VMXE > is meaningless. If we do end up with a bug where a CPU fails to do VMXON, this > sanity check would give a decent chance of a precise report, whereas #UD on a > SEAMCALL will be less clearcut. If VMXON fails for any CPU then cpuhp_setup_state() will fail, preventing KVM to be loaded. And if it fails during kvm_online_cpu(), the new CPU will fail to online. > > > Btw, I noticed there's another problem, that is currently tdx_cpu_enable() > > actually requires IRQ being disabled. Again it was implemented based on > > it would be invoked via both on_each_cpu() and kvm_online_cpu(). > > > > It also also implemented with consideration that it could be called by > > multiple in-kernel TDX users in parallel via both SMP call and in normal > > context, so it was implemented to simply request the caller to make sure > > it is called with IRQ disabled so it can be IRQ safe (it uses a percpu > > variable to track whether TDH.SYS.LP.INIT has been done for local cpu > > similar to the hardware_enabled percpu variable). > > Is this is an actual problem, or is it just something that would need to be > updated in the TDX code to handle the change in direction? For now this isn't, because KVM is the solo user, and in KVM hardware_enable_all() and kvm_online_cpu() uses kvm_lock mutex to make hardware_enable_nolock() IPI safe. I am not sure how TDX/SEAMCALL will be used in TDX Connect. However I needed to consider KVM as a user, so I decided to just make it must be called with IRQ disabled so I could know it is IRQ safe. Back to the current tdx_enable() and tdx_cpu_enable(), my personal preference is, of course, to keep the existing way, that is: During module load: cpus_read_lock(); tdx_enable(); cpus_read_unlock(); and in kvm_online_cpu(): local_irq_save(); tdx_cpu_enable(); local_irq_restore(); But given KVM is the solo user now, I am also fine to change if you believe this is not acceptable.