Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT

"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> · Mon, 9 Sep 2024 22:34:55 +0000

On Mon, 2024-09-09 at 14:23 -0700, Sean Christopherson wrote:
> > In general, I am _very_ opposed to blindly retrying an SEPT SEAMCALL, ever. 
> > For
> > its operations, I'm pretty sure the only sane approach is for KVM to ensure
> > there
> > will be no contention.  And if the TDX module's single-step protection
> > spuriously
> > kicks in, KVM exits to userspace.  If the TDX module can't/doesn't/won't
> > communicate
> > that it's mitigating single-step, e.g. so that KVM can forward the
> > information
> > to userspace, then that's a TDX module problem to solve.
> > 
> > > Per the docs, in general the VMM is supposed to retry SEAMCALLs that
> > > return
> > > TDX_OPERAND_BUSY.
> > 
> > IMO, that's terrible advice.  SGX has similar behavior, where the xucode
> > "module"
> > signals #GP if there's a conflict.  #GP is obviously far, far worse as it
> > lacks
> > the precision that would help software understand exactly what went wrong,
> > but I
> > think one of the better decisions we made with the SGX driver was to have a
> > "zero tolerance" policy where the driver would _never_ retry due to a
> > potential
> > resource conflict, i.e. that any conflict in the module would be treated as
> > a
> > kernel bug.

Thanks for the analysis. The direction seems reasonable to me for this lock in
particular. We need to do some analysis on how much the existing mmu_lock can
protects us. Maybe sprinkle some asserts for documentation purposes.

For the general case of TDX_OPERAND_BUSY, there might be one wrinkle. The guest
side operations can take the locks too. From "Base Architecture Specification":
"
Host-Side (SEAMCALL) Operation
------------------------------
The host VMM is expected to retry host-side operations that fail with a
TDX_OPERAND_BUSY status. The host priority mechanism helps guarantee that at
most after a limited time (the longest guest-side TDX module flow) there will be
no contention with a guest TD attempting to acquire access to the same resource.

Lock operations process the HOST_PRIORITY bit as follows:
   - A SEAMCALL (host-side) function that fails to acquire a lock sets the lock’s
   HOST_PRIORITY bit and returns a TDX_OPERAND_BUSY status to the host VMM. It is
   the host VMM’s responsibility to re-attempt the SEAMCALL function until is
   succeeds; otherwise, the HOST_PRIORITY bit remains set, preventing the guest TD
   from acquiring the lock.
   - A SEAMCALL (host-side) function that succeeds to acquire a lock clears the
   lock’s HOST_PRIORITY bit.

Guest-Side (TDCALL) Operation
-----------------------------
A TDCALL (guest-side) function that attempt to acquire a lock fails if
HOST_PRIORITY is set to 1; a TDX_OPERAND_BUSY status is returned to the guest.
The guest is expected to retry the operation.

Guest-side TDCALL flows that acquire a host priority lock have an upper bound on
the host-side latency for that lock; once a lock is acquired, the flow either
releases within a fixed upper time bound, or periodically monitor the
HOST_PRIORITY flag to see if the host is attempting to acquire the lock.
"

So KVM can't fully prevent TDX_OPERAND_BUSY with KVM side locks, because it is
involved in sorting out contention between the guest as well. We need to double
check this, but I *think* this HOST_PRIORITY bit doesn't come into play for the
functionality we need to exercise for base support.

The thing that makes me nervous about retry based solution is the potential for
some kind deadlock like pattern. Just to gather your opinion, if there was some
SEAMCALL contention that couldn't be locked around from KVM, but came with some
strong well described guarantees, would a retry loop be hard NAK still?