On Mon, 2023-12-04 at 14:04 -0800, Hansen, Dave wrote:
> On 12/4/23 13:00, Huang, Kai wrote:
> > > tl;dr: I think even looking a #MC on the PAMT after the kvm module is
> > > removed is a fool's errand.
> > Sorry I wasn't clear enough.  KVM actually turns off VMX when it destroys the
> > last VM, so the KVM module doesn't need to be removed to turn off VMX.  I used
> > "KVM can be unloaded" as an example to explain the PAMT can be working when VMX
> > is off.
>
> Can't we just fix this by having KVM do an "extra" hardware_enable_all()
> before initializing the TDX module?

Yes, KVM needs to do hardware_enable_all() anyway before initializing the TDX
module.

I believe you mean we can keep VMX enabled after initializing the TDX module,
i.e., not call hardware_disable_all() afterwards, so that kvm_usage_count
remains non-zero even when the last VM is destroyed?

KVM currently enables VMX only while there is an active VM because it (or the
kernel) wants to allow a third-party VMX module (yes, VirtualBox) to be loaded
and run while the KVM module is loaded.  Only one of them can actually use the
VMX hardware at a time, but both can be loaded.

Long ago KVM enabled VMX immediately when it was loaded, but it was later
changed to enable VMX only when there is an active VM, for the reason above.

See commit 10474ae8945ce ("KVM: Activate Virtualization On Demand").

> It's not wrong to say that TDX is a
> KVM user.  If KVM wants 'kvm_usage_count' to go back to 0, it can shut
> down the TDX module.  Then there's no PAMT to worry about.
>
> The shutdown would be something like:
>
> 	1. TDX module shutdown
> 	2. Deallocate/Convert PAMT
> 	3. vmxoff
>
> Then, no SEAMCALL failure because of vmxoff can cause a PAMT-induced #MC
> to be missed.

The limitation is that once the TDX module is shut down, it cannot be
initialized again unless it is updated at runtime.

In the long term, if we go with this design there might be other problems when
other kernel components use TDX.  For example, the VT-d driver will need to be
changed to support TDX-IO, and it will need to enable the TDX module much
earlier than KVM to do some initialization.  It might also need to do some TDX
work (e.g., cleanup) while KVM is unloaded.  I am not very familiar with
TDX-IO, but it looks like we might have a problem here if we go with such a
design.
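
To make the usage-count idea above concrete, here is a rough userspace model
(not kernel code).  It assumes the TDX module holds one extra reference on the
VMX usage count from initialization until shutdown, and that shutdown follows
the three steps quoted above.  All model_* names and the struct are
hypothetical and only mimic kvm_usage_count and hardware_{enable,disable}_all()
by analogy; the real locking and per-CPU handling are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is a real kernel structure. */
struct vmx_model {
	int usage_count;	/* models kvm_usage_count */
	bool vmx_on;		/* models VMX being enabled */
	bool tdx_initialized;	/* TDX module initialized */
	bool tdx_shut_down;	/* set once; blocks re-initialization */
	bool pamt_allocated;	/* PAMT currently handed to the TDX module */
};

/* Models hardware_enable_all(): the first user turns VMX on. */
static void model_hardware_enable_all(struct vmx_model *m)
{
	if (m->usage_count++ == 0)
		m->vmx_on = true;
}

/* Models hardware_disable_all(): the last user turns VMX off. */
static void model_hardware_disable_all(struct vmx_model *m)
{
	assert(m->usage_count > 0);
	if (--m->usage_count == 0)
		m->vmx_on = false;
}

/* VM creation and destruction each take or drop one reference. */
static void model_vm_create(struct vmx_model *m)
{
	model_hardware_enable_all(m);
}

static void model_vm_destroy(struct vmx_model *m)
{
	model_hardware_disable_all(m);
}

/*
 * Assumption: TDX init takes the "extra" reference and keeps it, so
 * VMX stays on after the last VM is destroyed.  Returns 0 on success,
 * -1 if the module is already initialized or was shut down before.
 */
static int model_tdx_init(struct vmx_model *m)
{
	if (m->tdx_initialized || m->tdx_shut_down)
		return -1;
	model_hardware_enable_all(m);
	m->pamt_allocated = true;
	m->tdx_initialized = true;
	return 0;
}

/* Shutdown in the quoted order.  Returns 0 on success, -1 otherwise. */
static int model_tdx_shutdown(struct vmx_model *m)
{
	if (!m->tdx_initialized)
		return -1;
	assert(m->vmx_on);		/* SEAMCALL needs VMX on */
	m->tdx_initialized = false;	/* 1. TDX module shutdown */
	m->tdx_shut_down = true;
	m->pamt_allocated = false;	/* 2. Deallocate/Convert PAMT */
	model_hardware_disable_all(m);	/* 3. vmxoff if last user */
	return 0;
}
```

In this model the PAMT is never live while VMX is off, but a second
model_tdx_init() after shutdown fails, which is the limitation mentioned
above.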