On Mon, Mar 13, 2023 at 02:49:03PM +0000, "Wang, Wei W" <wei.w.wang@xxxxxxxxx> wrote: > On Monday, March 13, 2023 1:55 AM, isaku.yamahata@xxxxxxxxx wrote: > > Currently, KVM VMX module initialization/exit functions are a single function > > each. Refactor KVM VMX module initialization functions into KVM common > > part and VMX part so that TDX specific part can be added cleanly. > > Opportunistically refactor module exit function as well. > > > > The current module initialization flow is, > > 0.) Check if VMX is supported, > > 1.) hyper-v specific initialization, > > 2.) system-wide x86 specific and vendor specific initialization, > > 3.) Final VMX specific system-wide initialization, > > 4.) calculate the sizes of VMX kvm structure and VMX vcpu structure, > > 5.) report those sizes to the KVM common layer and KVM common > > initialization > > > > Refactor the KVM VMX module initialization function into functions with a > > wrapper function to separate VMX logic in vmx.c from a file, main.c, common > > among VMX and TDX. Introduce a wrapper function for vmx_init(). > > > > The KVM architecture common layer allocates struct kvm with reported size for > > architecture-specific code. The KVM VMX module defines its structure as > > struct vmx_kvm { struct kvm; VMX specific members;} and uses it as struct vmx > > kvm. Similar for vcpu structure. TDX KVM patches will define TDX specific kvm > > and vcpu structures. > > > > The current module exit function is also a single function, a combination of > > VMX specific logic and common KVM logic. Refactor it into VMX specific logic > > and KVM common logic. This is just refactoring to keep the VMX specific logic > > in vmx.c from main.c. > > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > --- > > arch/x86/kvm/vmx/main.c | 51 +++++++++++++++++++++++++++++++++++ > > arch/x86/kvm/vmx/vmx.c | 54 +++++--------------------------------- > > arch/x86/kvm/vmx/x86_ops.h | 13 ++++++++- > > 3 files changed, 69 insertions(+), 49 deletions(-) > > > > diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index > > a59559ff140e..3f49e8e38b6b 100644 > > --- a/arch/x86/kvm/vmx/main.c > > +++ b/arch/x86/kvm/vmx/main.c > > @@ -165,3 +165,54 @@ struct kvm_x86_init_ops vt_init_ops __initdata = { > > .runtime_ops = &vt_x86_ops, > > .pmu_ops = &intel_pmu_ops, > > }; > > + > > +static int __init vt_init(void) > > +{ > > + unsigned int vcpu_size, vcpu_align; > > + int r; > > + > > + if (!kvm_is_vmx_supported()) > > + return -EOPNOTSUPP; > > + > > + /* > > + * Note, hv_init_evmcs() touches only VMX knobs, i.e. there's nothing > > + * to unwind if a later step fails. > > + */ > > + hv_init_evmcs(); > > + > > + r = kvm_x86_vendor_init(&vt_init_ops); > > + if (r) > > + return r; > > + > > + r = vmx_init(); > > + if (r) > > + goto err_vmx_init; > > + > > + /* > > + * Common KVM initialization _must_ come last, after this, /dev/kvm is > > + * exposed to userspace! > > + */ > > + vt_x86_ops.vm_size = sizeof(struct kvm_vmx); > > + vcpu_size = sizeof(struct vcpu_vmx); > > + vcpu_align = __alignof__(struct vcpu_vmx); > > + r = kvm_init(vcpu_size, vcpu_align, THIS_MODULE); > > + if (r) > > + goto err_kvm_init; > > + > > + return 0; > > + > > +err_kvm_init: > > + vmx_exit(); > > +err_vmx_init: > > + kvm_x86_vendor_exit(); > > + return r; > > +} > > +module_init(vt_init); > > I had a patch to fix a bug here, maybe you can take it: > > kvm_x86_vendor_init copies vt_x86_ops to kvm_x86_ops. vt_x86_ops.vm_size > needs to be updated before calling kvm_x86_vendor_init so that kvm_x86_ops > can get the correct vm_size. Thanks for catching it. With your patch, vm_size is always max(sizeof struct kvm_vmx, sizeof strut kvm_tdx) even when the admin sets kvm_intel.tdx=true and tdx is disabled by error. option 1: Ignore such waste. Your patch. The difference is small and it's only the error case. Locally I have the following values. sizeof(struct kvm_vmx) = 44576 sizeof(struct vcpu_vmx) = 10432 sizeof(struct kvm_tdx)= 44632 sizeof(struct vcpu_tdx) = 8192 I suspect the actual allocation size for struct kvm is same. That's the reason why I didn't hit problem. option 2: Explicitly update vm_size after kvm_x86_vendor_init() struct kvm_x86_ops isn't exported. It would be ugly. option 3: Allow setup_hardware() to update vm_size. setup_hardware(void) => setup_hardware(unsigned int *vm_size) It's confusing because kvm_x86_ops.vm_size is already initialized. Let's go with option 1(your patch). -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>