Re: [PATCH 3/3] KVM: VMX: Initialize TDX during KVM module load

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 06, 2024, Kai Huang wrote:
> On Thu, 2024-10-31 at 13:22 -0700, Sean Christopherson wrote:
> > On Thu, Oct 31, 2024, Kai Huang wrote:
> > > On Wed, 2024-10-30 at 08:19 -0700, Sean Christopherson wrote:
> > > > > +void __init tdx_bringup(void)
> > > > > +{
> > > > > +	enable_tdx = enable_tdx && !__tdx_bringup();
> > > > 
> > > > Ah.  I don't love this approach because it mixes "failure" due to an unsupported
> > > > configuration, with failure due to unexpected issues.  E.g. if enabling virtualization
> > > > fails, loading KVM-the-module absolutely should fail too, not simply disable TDX.
> > > 
> > > Thanks for the comments.
> > > 
> > > I see your point.  However for "enabling virtualization failure" kvm_init() will
> > > also try to do (default behaviour), so if it fails it will result in module
> > > loading failure eventually.  So while I guess it would be slightly better to
> > > make module loading fail if "enabling virtualization fails" in TDX, it is a nit
> > > issue to me.
> > > 
> > > I think "enabling virtualization failure" is the only "unexpected issue" that
> > > should result in module loading failure.  For any other TDX-specific
> > > initialization failure (e.g., any memory allocation in future patches) it's
> > > better to only disable TDX.
> > 
> > I disagree.  The platform owner wants TDX to be enabled, KVM shouldn't silently
> > disable TDX because of a transient, unrelated failure.
> > 
> > If TDX _can't_ be supported, e.g. because EPT or MMIO SPTE caching was explicitly
> > disable, then that's different.  And that's the general pattern throughout KVM.
> > If a requested feature isn't supported, then KVM continues on updates the module
> > param accordingly.  But if something outright fails during setup, KVM aborts the
> > entire sequence.
> > 
> > > So I can change to "make loading KVM-the-module fail if enabling virtualization
> > > fails in TDX", but I want to confirm this is what you want?
> > 
> > I would prefer the logic to be: reject loading kvm-intel.ko if an operation that
> > would normally succeed, fails.
> 
> I looked at the final tdx.c that in our development branch [*], and below is the
> list of the things that need to be done to init TDX (the code in
> __tdx_bringup()), and my thinking of whether to fail loading the module or just
> disable TDX:
> 
> 1) Early dependency check fails.  Those include: tdp_mmu_enabled,
> enable_mmio_caching, X86_FEATURE_MOVDIR64B check and check the presence of
> TSX_CTL uret MSR.
> 
> For those we can disable TDX only but continue to load module.
> 
> 2) Enable virtualization fails.
> 
> For this we fail to load module (as you suggested).
> 
> 3) Fail to register TDX cpuhp to do tdx_cpu_enable() and handle cpu hotplug.
> 
> For this we only disable TDX but continue to load module.  The reason is I think
> this is similar to enable a specific KVM feature but the hardware doesn't
> support it.  We can go further to check the return value of tdx_cpu_enable() to
> distinguish cases like "module not loaded" and "unexpected error", but I really
> don't want to go that far.

Hrm, tdx_cpu_enable() is a bit of a mess.  Ideally, there would be a separate
"probe" API so that KVM could detect if TDX is supported.  Though maybe it's the
TDX module itself is flawed, e.g. if TDH_SYS_INIT is literally the only way to
detect whether or not a module is loaded.

So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if
it returns -ENODEV, otherwise fail the module load?

> 4) tdx_enable() fails.
> 
> Ditto to 3).

No, this should fail the module load.  E.g. most of the error conditions are
-ENOMEM, which has nothing to do with host support for TDX.

> 5) tdx_get_sysinfo() fails.
> 
> This is a kernel bug since tdx_get_sysinfo() should always return valid TDX
> sysinfo structure pointer after tdx_enable() is done successfully.  Currently we
> just WARN() if the returned pointer is NULL and disable TDX only.  I think it's
> also fine.
> 
> 6) TDX global metadata check fails, e.g., MAX_VCPUS etc.
> 
> Ditto to 3).  For this we disable TDX only.

Where is this code?





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux