Re: [PATCH v3 04/21] x86/virt/tdx: Add skeleton for detecting and initializing TDX on demand

Dave Hansen <dave.hansen@xxxxxxxxx> · Wed, 27 Apr 2022 07:49:50 -0700

On 4/26/22 17:43, Kai Huang wrote:
> On Tue, 2022-04-26 at 13:53 -0700, Dave Hansen wrote:
>> On 4/5/22 21:49, Kai Huang wrote:
...
>>> +static bool tdx_keyid_sufficient(void)
>>> +{
>>> +	if (!cpumask_equal(&cpus_booted_once_mask,
>>> +					cpu_present_mask))
>>> +		return false;
>>
>> I'd move this cpumask_equal() to a helper.
> 
> Sorry to double confirm, do you want something like:
> 
> static bool tdx_detected_on_all_cpus(void)
> {
> 	/*
> 	 * To detect any BIOS misconfiguration among cores, all logical
> 	 * cpus must have been brought up at least once.  This is true
> 	 * unless 'maxcpus' kernel command line is used to limit the
> 	 * number of cpus to be brought up during boot time.  However
> 	 * 'maxcpus' is basically an invalid operation mode due to the
> 	 * MCE broadcast problem, and it should not be used on a TDX
> 	 * capable machine.  Just do paranoid check here and do not
> 	 * report SEAMRR as enabled in this case.
> 	 */
> 	return cpumask_equal(&cpus_booted_once_mask, cpu_present_mask);
> }

That's logically the right idea, but I hate the name since the actual
test has nothing to do with TDX being detected.  The comment is also
rather verbose and rambling.

It should be named something like:

	all_cpus_booted()

and with a comment like this:

/*
 * To initialize TDX, the kernel needs to run some code on every
 * present CPU.  Detect cases where present CPUs have not been
 * booted, like when maxcpus=N is used.
 */

> static bool seamrr_enabled(void)
> {
> 	if (!tdx_detected_on_all_cpus())
> 		return false;
> 
> 	return __seamrr_enabled();
> }
> 
> static bool tdx_keyid_sufficient()
> {
> 	if (!tdx_detected_on_all_cpus())
> 		return false;
> 
> 	...
> }

Although, looking at those, it's *still* unclear why you need this.  I
assume it's because some later TDX SEAMCALL will fail if you get this
wrong, and you want to be able to provide a better error message.

*BUT* this code doesn't actually provide halfway reasonable error
messages.  If someone uses maxcpus=99, then this code will report:

	pr_info("SEAMRR not enabled.\n");

right?  That's bonkers.

>>> +	/*
>>> +	 * TDX requires at least two KeyIDs: one global KeyID to
>>> +	 * protect the metadata of the TDX module and one or more
>>> +	 * KeyIDs to run TD guests.
>>> +	 */
>>> +	return tdx_keyid_num >= 2;
>>> +}
>>> +
>>> +static int __tdx_detect(void)
>>> +{
>>> +	/* The TDX module is not loaded if SEAMRR is disabled */
>>> +	if (!seamrr_enabled()) {
>>> +		pr_info("SEAMRR not enabled.\n");
>>> +		goto no_tdx_module;
>>> +	}
>>
>> Why even bother with the SEAMRR stuff?  It sounded like you can "ping"
>> the module with SEAMCALL.  Why not just use that directly?
> 
> SEAMCALL will cause #GP if SEAMRR is not enabled.  We should check whether
> SEAMRR is enabled before making SEAMCALL.

So...  You could actually get rid of all this code.  if SEAMCALL #GP's,
then you say, "Whoops, the firmware didn't load the TDX module
correctly, sorry."

Why is all this code here?  What is it for?

>>> +	/*
>>> +	 * Also do not report the TDX module as loaded if there's
>>> +	 * no enough TDX private KeyIDs to run any TD guests.
>>> +	 */
>>> +	if (!tdx_keyid_sufficient()) {
>>> +		pr_info("Number of TDX private KeyIDs too small: %u.\n",
>>> +				tdx_keyid_num);
>>> +		goto no_tdx_module;
>>> +	}
>>> +
>>> +	/* Return -ENODEV until the TDX module is detected */
>>> +no_tdx_module:
>>> +	tdx_module_status = TDX_MODULE_NONE;
>>> +	return -ENODEV;
>>> +}

Again, if someone uses maxcpus=1234 and we get down here, then it
reports to the user:

	Number of TDX private KeyIDs too small: ...

????  When the root of the problem has nothing to do with KeyIDs.

>>> +static int init_tdx_module(void)
>>> +{
>>> +	/*
>>> +	 * Return -EFAULT until all steps of TDX module
>>> +	 * initialization are done.
>>> +	 */
>>> +	return -EFAULT;
>>> +}
>>> +
>>> +static void shutdown_tdx_module(void)
>>> +{
>>> +	/* TODO: Shut down the TDX module */
>>> +	tdx_module_status = TDX_MODULE_SHUTDOWN;
>>> +}
>>> +
>>> +static int __tdx_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	/*
>>> +	 * Logical-cpu scope initialization requires calling one SEAMCALL
>>> +	 * on all logical cpus enabled by BIOS.  Shutting down the TDX
>>> +	 * module also has such requirement.  Further more, configuring
>>> +	 * the key of the global KeyID requires calling one SEAMCALL for
>>> +	 * each package.  For simplicity, disable CPU hotplug in the whole
>>> +	 * initialization process.
>>> +	 *
>>> +	 * It's perhaps better to check whether all BIOS-enabled cpus are
>>> +	 * online before starting initializing, and return early if not.
>>
>> But you did some of this cpumask checking above.  Right?
> 
> Above check only guarantees SEAMRR/TDX KeyID has been detected on all presnet
> cpus.  the 'present' cpumask doesn't equal to all BIOS-enabled CPUs.

I have no idea what this is saying.  In general, I have no idea what the
comment is saying.  It makes zero sense.  The locking pattern for stuff
like this is:

	cpus_read_lock();

	for_each_online_cpu()
		do_something();

	cpus_read_unlock();

because you need to make sure that you don't miss "do_something()" on a
CPU that comes online during the loop.

But, now that I think about it, all of the checks I've seen so far are
for *booted* CPUs.  While the lock (I assume) would keep new CPUs from
booting, it doesn't do any good really since the "cpus_booted_once_mask"
bits are only set and not cleared.  A CPU doesn't un-become booted once.

Again, we seem to have a long, verbose comment that says very little and
only confuses me.

...
>> Why does this need both a tdx_detect() and a tdx_init()?  Shouldn't the
>> interface from outside just be "get TDX up and running, please?"
> 
> We can have a single tdx_init().  However tdx_init() can be heavy, and having a
> separate non-heavy tdx_detect() may be useful if caller wants to separate
> "detecting the TDX module" and "initializing the TDX module", i.e. to do
> something in the middle.

<Sigh>  So, this "design" went unmentioned, *and* I can't review if the
actual callers of this need the functionality or not because they're not
in this series.

> However tdx_detect() basically only detects P-SEAMLDR.  If we move P-SEAMLDR
> detection to tdx_init(), or we git rid of P-SEAMLDR completely, then we don't
> need tdx_detect() anymore.  We can expose seamrr_enabled() and TDX KeyID
> variables or functions so caller can use them to see whether it should do TDX
> related staff and then call tdx_init().

I don't think you've made a strong case for why P-SEAMLDR detection is
even necessary in this series.