Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


> > > And the intent isn't to catch every possible problem.  As with many sanity checks,
> > > the intent is to detect the most likely failure mode to make triaging and debugging
> > > issues a bit easier.
> > 
> > The SEAMCALL will literally return a unique error code to indicate CPU
> > isn't in post-VMXON, or tdx_cpu_enable() hasn't been done.  I think the
> > error code is already clear to pinpoint the problem (due to these pre-
> > SEAMCALL-condition not being met).
> No, SEAMCALL #UDs if the CPU isn't post-VMXON.  I.e. the CPU doesn't make it to
> the TDX Module to provide a unique error code, all KVM will see is a #UD.

#UD is handled by the SEAMCALL assembly code.  Please see TDX_MODULE_CALL
assembly macro:

         * SEAMCALL caused #GP or #UD.  By reaching here RAX contains
         * the trap number.  Convert the trap number to the TDX error
         * code by setting TDX_SW_ERROR to the high 32-bits of RAX.
         * Note cannot OR TDX_SW_ERROR directly to RAX as OR instruction
         * only accepts 32-bit immediate at most.
        movq $TDX_SW_ERROR, %rdi
        orq  %rdi, %rax

	_ASM_EXTABLE_FAULT(.Lseamcall\@, .Lseamcall_trap\@)
.endif  /* \host */

> > > > Btw, I noticed there's another problem, that is currently tdx_cpu_enable()
> > > > actually requires IRQ being disabled.  Again it was implemented based on
> > > > it would be invoked via both on_each_cpu() and kvm_online_cpu().
> > > > 
> > > > It also also implemented with consideration that it could be called by
> > > > multiple in-kernel TDX users in parallel via both SMP call and in normal
> > > > context, so it was implemented to simply request the caller to make sure
> > > > it is called with IRQ disabled so it can be IRQ safe  (it uses a percpu
> > > > variable to track whether TDH.SYS.LP.INIT has been done for local cpu
> > > > similar to the hardware_enabled percpu variable).
> > > 
> > > Is this is an actual problem, or is it just something that would need to be
> > > updated in the TDX code to handle the change in direction?
> > 
> > For now this isn't, because KVM is the solo user, and in KVM
> > hardware_enable_all() and kvm_online_cpu() uses kvm_lock mutex to make
> > hardware_enable_nolock() IPI safe.
> > 
> > I am not sure how TDX/SEAMCALL will be used in TDX Connect.
> > 
> > However I needed to consider KVM as a user, so I decided to just make it
> > must be called with IRQ disabled so I could know it is IRQ safe.
> > 
> > Back to the current tdx_enable() and tdx_cpu_enable(), my personal
> > preference is, of course, to keep the existing way, that is:
> > 
> > During module load:
> > 
> > 	cpus_read_lock();
> > 	tdx_enable();
> > 	cpus_read_unlock();
> > 
> > and in kvm_online_cpu():
> > 
> > 	local_irq_save();
> > 	tdx_cpu_enable();
> > 	local_irq_restore();
> > 
> > But given KVM is the solo user now, I am also fine to change if you
> > believe this is not acceptable.
> Looking more closely at the code, tdx_enable() needs to be called under
> cpu_hotplug_lock to prevent *unplug*, i.e. to prevent the last CPU on a package
> from being offlined.  I.e. that part's not option.

Yeah.  We can say that.  I almost forgot this :-)

> And the root of the problem/confusion is that the APIs provided by the core kernel
> are weird, which is really just a polite way of saying they are awful :-)

Well, apologize for it :-)

> There is no reason to rely on the caller to take cpu_hotplug_lock, and definitely
> no reason to rely on the caller to invoke tdx_cpu_enable() separately from invoking
> tdx_enable().  I suspect they got that way because of KVM's unnecessarily complex
> code, e.g. if KVM is already doing on_each_cpu() to do VMXON, then it's easy enough
> to also do TDH_SYS_LP_INIT, so why do two IPIs?

The main reason is we relaxed the TDH.SYS.LP.INIT to be called _after_ TDX
module initialization.  

Previously, the TDH.SYS.LP.INIT must be done on *ALL* CPUs that the
platform has (i.e., cpu_present_mask) right after TDH.SYS.INIT and before
any other SEAMCALLs.  This didn't quite work with (kernel software) CPU
hotplug, and it had problem dealing with things like SMT disable

So the x86 maintainers requested to change this.  The original proposal
was to eliminate the entire TDH.SYS.INIT and TDH.SYS.LP.INIT:

But somehow it wasn't feasible, and the result was we relaxed to allow
TDH.SYS.LP.INIT to be called after module initialization.

So we need a separate tdx_cpu_enable() for that.

> But just because KVM "owns" VMXON doesn't mean the core kernel code should punt
> TDX to KVM too.  If KVM relies on the cpuhp code to ensure all online CPUs are
> post-VMXON, then the TDX shapes up nicely and provides a single API to enable
> TDX.  

We could ask tdx_enable() to invoke tdx_cpu_enable() internally, but as
mentioned above we still to have the tdx_cpu_enable() as a separate API to
allow CPU hotplug to call it.

> And then my CR4.VMXE=1 sanity check makes a _lot_ more sense.

As replied above the current SEAMCALL assembly returns a unique error code
for that:


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux