Re: [PATCH v12 07/22] x86/virt/tdx: Add skeleton to enable TDX on demand

Dave Hansen <dave.hansen@xxxxxxxxx> · Mon, 3 Jul 2023 08:26:00 -0700

On 7/3/23 08:03, Peter Zijlstra wrote:
> On Mon, Jul 03, 2023 at 07:40:55AM -0700, Dave Hansen wrote:
>> On 7/3/23 03:49, Peter Zijlstra wrote:
>>>> There are also latency and noisy neighbor concerns, e.g. we *really* don't want
>>>> to end up in a situation where creating a TDX guest for a customer can observe
>>>> arbitrary latency *and* potentially be disruptive to VMs already running on the
>>>> host.
>>> Well, that's a quality of implementation issue with the whole TDX
>>> crapola. Sounds like we want to impose latency constraints on the
>>> various TDX calls. Allowing it to consume arbitrary amounts of CPU time
>>> is unacceptable in any case.
>>
>> For what it's worth, everybody knew that calling into the TDX module was
>> going to be a black hole and that consuming large amounts of CPU at
>> random times would drive people bat guano crazy.
>>
>> The TDX Module ABI spec does have "Leaf Function Latency" warnings for
>> some of the module calls.  But, it's basically a binary thing.  A call
>> is either normal or "longer than most".
>>
>> The majority of the "longer than most" cases are for initialization.
>> The _most_ obscene runtime ones are chunked up and can return partial
>> progress to limit latency spikes.  But I don't think folks tried as hard
>> on the initialization calls since they're only called once which
>> actually seems pretty reasonable to me.
>>
>> Maybe we need three classes of "Leaf Function Latency":
>> 1. Sane
>> 2. "Longer than most"
>> 3. Better turn the NMI watchdog off before calling this. :)
>>
>> Would that help?
> 
> I'm thikning we want something along the lines of the Xen preemptible
> hypercalls, except less crazy. Where the caller does:
> 
> 	for (;;) {
> 		ret = tdcall(fn, args);
> 		if (ret == -EAGAIN) {
> 			cond_resched();
> 			continue;
> 		}
> 		break;
> 	}
> 
> And then the TDX black box provides a guarantee that any one tdcall (or
> seamcall or whatever) never takes more than X ns (possibly even
> configurable) and we get to raise a bug report if we can prove it
> actually takes longer.

It's _supposed_ to be doing something kinda like that.  For instance, in
the places that need locking, the TDX module essentially does:

	if (!trylock(&lock))
		return -EBUSY;

which is a heck of a lot better than spinning in the TDX module.  Those
module locks are also almost always for things that *also* have some
kind of concurrency control in Linux too.

*But*, there are also the really nasty calls that *do* take forever.  It
would be great to have a list of them or, heck, even *enumeration* of
which ones can take forever so we don't need to maintain a table.