Re: [RFC PATCH v5 09/29] KVM: selftests: TDX: Add report_fatal_error test

On 4/23/2024 5:23 AM, Sean Christopherson wrote:
On Thu, Apr 18, 2024, Yan Zhao wrote:
On Tue, Apr 16, 2024 at 11:50:19AM -0700, Sean Christopherson wrote:
On Mon, Apr 15, 2024, Yan Zhao wrote:
On Mon, Apr 15, 2024 at 08:05:49AM +0000, Ackerley Tng wrote:
The Intel GHCI Spec says in R12, bit 63 is set if the GPA is valid. As a
But the "__LINE__" above is obviously not a valid GPA.

Do you think it would be better to check that "data_gpa" has the shared bit
set and is 4K-aligned before setting bit 63?

I read "valid" in the spec to mean that the value in R13 "should be
considered as useful" or "should be passed on to the host VMM via the
TDX module", and not so much as in "validated".

We could validate the data_gpa as you suggested to check alignment and
shared bit, but perhaps that could be a higher-level function that calls
tdg_vp_vmcall_report_fatal_error()?

If it helps, shall we rename "data_gpa" to "data" for this lower-level,
generic helper function and remove these two lines

if (data_gpa)
	error_code |= 0x8000000000000000;

A higher-level function could perhaps do the validation as you suggested
and then set bit 63.
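
A minimal sketch of what such a higher-level check might look like, assuming
the shared bit position is passed in by the caller (it depends on the guest's
GPA width). The names tdx_data_gpa_is_valid() and tdx_fatal_error_code() are
hypothetical and are not part of this patch series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of the R12 error code, i.e. the "GPA valid" flag discussed above. */
#define TDX_FATAL_ERROR_GPA_VALID	(1ULL << 63)
#define TDX_PAGE_SIZE_4K		4096ULL

/*
 * Hypothetical helper: true if @gpa is 4K-aligned and has the shared bit set.
 * @shared_bit is an assumption of this sketch; the caller supplies it.
 */
static bool tdx_data_gpa_is_valid(uint64_t gpa, unsigned int shared_bit)
{
	if (gpa & (TDX_PAGE_SIZE_4K - 1))
		return false;

	return (gpa & (1ULL << shared_bit)) != 0;
}

/*
 * Hypothetical wrapper: compute the R12 value to pass to the low-level
 * helper, setting bit 63 only when the GPA passes the checks above.
 */
static uint64_t tdx_fatal_error_code(uint64_t error_code, uint64_t data_gpa,
				     unsigned int shared_bit)
{
	if (tdx_data_gpa_is_valid(data_gpa, shared_bit))
		error_code |= TDX_FATAL_ERROR_GPA_VALID;

	return error_code;
}
```

With this split, the low-level helper would pass R12 and R13 through
untouched, and only callers that care about GHCI semantics would use the
wrapper.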
This could be all right, but I'm not sure whether it would be a burden for a
higher-level function to set bit 63, which is a GHCI detail.

What about adding another "data_gpa_valid" parameter and then testing
"data_gpa_valid" rather than "data_gpa" to decide whether to set bit 63?
Who cares what the GHCI says about validity?  The GHCI is a spec for getting
random guests to play nice with random hosts.  Selftests own both, and the goal
of selftests is to test that KVM (and KVM's dependencies) adhere to their relevant
specs.  And more importantly, KVM is NOT inheriting the GHCI ABI verbatim[*].

So except for the bits and bobs that *KVM* (or the TDX module) gets involved in,
just ignore the GHCI (or even deliberately abuse it).  To put it differently, use
selftests to verify *KVM's* ABI and functionality.

As it pertains to this thread, while I haven't looked at any of this in detail,
I'm guessing that whether or not bit 63 is set is a complete "don't care", i.e.
KVM and the TDX Module should pass it through as-is.

[*] https://lore.kernel.org/all/Zg18ul8Q4PGQMWam@xxxxxxxxxx
OK. That makes sense for KVM_EXIT_TDX.
But what if the TDVMCALL is handled by TDX-specific code in the kernel in the
future? (not possible?)
KVM will "handle" ReportFatalError, and will do so before this code lands[*], but
I *highly* doubt KVM will ever do anything but forward the information to userspace,
e.g. as KVM_SYSTEM_EVENT_CRASH with data[] filled in with the raw register information.

Should the guest set the bits correctly according to the GHCI?
No.  Selftests exist first and foremost to verify KVM behavior, not to verify
firmware behavior.  We can and should use selftests to verify that *KVM* doesn't
*violate* the GHCI, but that doesn't mean that selftests themselves can't ignore
and/or abuse the GHCI, especially since the GHCI definition for ReportFatalError
is frankly awful.

E.g. the GHCI prescribes actual behavior for R13, but then doesn't say *anything*
about what's in the data page.  Why!?!?!  If the format in the data page is
completely undefined, what's the point of restricting R13 to only be allowed to
hold a GPA?

The description of R13 in GHCI:
  4KB-aligned GPA where additional error data is shared by the TD. The
  VMM must validate that this GPA has the Shared bit set. In other words,
  that a shared-mapping is used, and that this is a valid mapping for the
  TD. This shared memory region is expected to hold a zero-terminated
  string.

IIUC, according to the GHCI, R13 holds the GPA of a 4K-aligned shared buffer
that the TDX guest provides to pass an additional error message to the VMM,
i.e., it needs to be a shared GPA.  And the buffer is expected to hold a
zero-terminated string.

Do you think "a zero-terminated string" describes the format in the data
page?
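
For reference, a host-side check of that expectation could be as small as the
sketch below. It assumes the 4K data page has already been mapped or copied
into host memory; tdx_error_data_is_string() is a hypothetical name and is not
an existing selftests helper:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Size of the shared error-data page, per the 4KB alignment quoted above. */
#define TDX_ERROR_DATA_SIZE	4096

/*
 * Hypothetical helper: true if the page holds a zero-terminated string,
 * i.e. a NUL byte appears somewhere within the 4K region.
 */
static bool tdx_error_data_is_string(const char *page)
{
	return memchr(page, '\0', TDX_ERROR_DATA_SIZE) != NULL;
}
```

Bounding the search with memchr() rather than calling strlen() keeps the host
from reading past the page if the guest never wrote a terminator.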



And the wording is just as awful:

   The VMM must validate that this GPA has the Shared bit set. In other words,
   that a shared-mapping is used, and that this is a valid mapping for the TD.

I'm pretty sure it's just saying that the TDX module isn't going to verify the
operand, i.e. that the VMM needs to protect itself, but it would be so much
better to simply state "The TDX Module does not verify this GPA", because saying
the VMM "must" do something leads to pointless discussions like this one, where
we're debating over whether or not *our* VMM should inject an error into *our* guest.

Anyways, we should do what makes sense for selftests and ignore the stupidity of
the GHCI when doing so yields better code.  If that means abusing R13, go for it.
If it's a sticking point for anyone, just use one of the "optional" registers.

Whatever we do, bury the host and guest side of selftests behind #defines or helpers
so that there are at most two pieces of code that care which register holds which
piece of information.
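
One way to do that, as a rough sketch: route both sides through a pair of
#defines and a pack/unpack helper. The register array, the enum, and every
name below are hypothetical and only illustrate the shape; the series' real
register plumbing will differ:

```c
#include <stdint.h>

/* Hypothetical register file shared by the guest and host sides of a test. */
enum { TDX_REG_R12, TDX_REG_R13, TDX_NR_REGS };

/* The only two places that know which register carries which field. */
#define TDX_FATAL_ERROR_CODE_REG	TDX_REG_R12
#define TDX_FATAL_ERROR_DATA_REG	TDX_REG_R13

struct tdx_fatal_error {
	uint64_t code;
	uint64_t data;
};

/* Guest side: place the error information into the outgoing registers. */
static void tdx_fatal_error_pack(uint64_t regs[TDX_NR_REGS],
				 const struct tdx_fatal_error *e)
{
	regs[TDX_FATAL_ERROR_CODE_REG] = e->code;
	regs[TDX_FATAL_ERROR_DATA_REG] = e->data;
}

/* Host side: recover the same information from the registers at exit. */
static struct tdx_fatal_error tdx_fatal_error_unpack(const uint64_t regs[TDX_NR_REGS])
{
	struct tdx_fatal_error e = {
		.code = regs[TDX_FATAL_ERROR_CODE_REG],
		.data = regs[TDX_FATAL_ERROR_DATA_REG],
	};

	return e;
}
```

Moving the data to one of the "optional" registers would then be a one-line
change to TDX_FATAL_ERROR_DATA_REG (plus an enum entry), with no test code
touched.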

[*] https://lore.kernel.org/all/20240404230247.GU2444378@xxxxxxxxxxxxxxxxxxxxx





