Re: [PATCH v2 2/3] virt: tdx-guest: Add Quote generation support

"Huang, Kai" <kai.huang@xxxxxxxxx> · Mon, 1 May 2023 12:48:36 +0000

On Sun, 2023-04-30 at 23:03 -0700, Sathyanarayanan Kuppuswamy wrote:
> Hi Kai,
> 
> On 4/28/23 6:49 AM, Huang, Kai wrote:
> > On Wed, 2023-04-12 at 20:41 -0700, Kuppuswamy Sathyanarayanan wrote:
> > > In TDX guest, the second stage in attestation process is to send the
> > > TDREPORT to QE/QGS to generate the TD Quote. For platforms that does
> > > not support communication channels like vsock or TCP/IP, implement
> > > support to get TD Quote using hypercall. GetQuote hypercall can be used
> > > by the TD guest to request VMM facilitate the Quote generation via
> > > QE/QGS. More details about GetQuote hypercall can be found in TDX
> > > Guest-Host Communication Interface (GHCI) for Intel TDX 1.0, section
> > > titled "TDG.VP.VMCALL<GetQuote>".
> > 
> > When this patch gets merged, the patch to get the TDREPORT would be long before
> > this patch.  To help the git blamers to understand more easily, I think it's
> > better to provide some background here.
> > 
> > FYI below:
> > 
> > "
> > In TDX attestation, the TDREPORT of a TDX guest contains information to uniquely
> > identify the TDX guest along with the TEE environment of the local machine. 
> > TDREPORT is integrity-protected and can only be verified on the local machine.  
> > 
> > To support TDX remote attestation, in SGX-based attestation, after the TDX guest
> > gets the TDREPORT from the TDX module, the TDREPORT needs to be sent to the SGX
> > Quoting Enclave (QE) to convert it to a remote verifiable Quote.
> > 
> > SGX QE can only run outside of the TDX guest (i.e. in a host process or in a
> > normal VM).  For security concern the TDX guest may not support normal
> > communication channels like vsock or TCP/IP.  To support remote attestation for
> > such case, TDX uses GetQuote TDVMCALL to ask the host VMM to communicate with
> > the SGX QE.  More details about GetQuote TDVMCALL can be found in ...
> > "
> > 
> 
> Ok. I will add some introduction to the commit log.
> 
> > > 
> > > Add support for TDX_CMD_GET_QUOTE IOCTL to allow attestation agent
> > > submit GetQuote requests from the user space using GetQuote hypercall.
> > > 
> > > Since GetQuote is an asynchronous request hypercall, VMM will use
> > > callback interrupt vector configured by SetupEventNotifyInterrupt
> > > hypercall to notify the guest about Quote generation completion or
> > > failure. So register an IRQ handler for it.
> > > 
> > > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > > with TDREPORT data as input, which is further used by the VMM to copy
> > > the TD Quote result after successful Quote generation. To create the
> > > shared buffer, allocate the required memory using alloc_pages() and
> > > mark it shared using set_memory_decrypted() in tdx_guest_init(). 
> > > 
> > 
> > Personally I think you don't need to mention "using alloc_pages() ... 
> > set_memory_decrypted()" staff.  They belong to details and the code can tell.
> > 
> > > This
> > > buffer will be re-used for GetQuote requests in TDX_CMD_GET_QUOTE
> > > IOCTL handler.
> > 
> > Besides the "re-used" part, I think it's better to explain the rational of
> > choosing a fixed 16K (4 pages) shared buffer.  For instance, in practice in
> > Intel's SGX QE implementation a Quote is less than 8K, and 16K should be big
> > enough in the foreseeable future even considering 3rd party's own
> > implementation.
> 
> I will add it part of the comment in the code.
> 
> > 
> > Also, I guess it's better to call out somewhere currently we don't support
> > multiple GetQuote in parallel because of <xxx>, so allocating a single shared
> > buffer at early time is OK.
> 
> 
> > 
> > > 
> > > Although this method will reserve a fixed chunk of memory for
> > > GetQuote requests during the init time, it is preferable to the
> > > alternative choice of allocating/freeing the shared buffer in the
> > > TDX_CMD_GET_QUOTE IOCTL handler, which will damage the direct map.
> > > 
> 
> Updated commit log looks like below:
> 
>     In TDX guest, the attestation process is used to verify the TDX guest
>     trustworthiness to other entities before provisioning secrets to the
>     guest. The First step in the attestation process is TDREPORT
>     generation, which involves getting the guest measurement data in the
>     format of TDREPORT, which is further used to validate the authenticity
>     of the TDX guest. TDREPORT by design is integrity-protected and can
>     only be verified on the local machine.
>     
>     To support remote verification of the TDREPORT (in a SGX-based
>     attestation), the TDREPORT needs to be sent to the SGX Quoting Enclave
>     (QE) to convert it to a remote verifiable Quote. SGX QE by design can
>     only run outside of the TDX guest (i.e. in a host process or in a
>     normal VM) and the guest can use communication channels like vsock or
>     TCP/IP to send the TDREPORT to the QE. But for security concerns, some
>     platforms may not support these communication channels. To handle such

"platforms may not support vsock/tcp/ip etc" isn't correct.  It should be the
"TDX guest" may not support those.

The security concern is mainly that the CSP doesn't want to expose vsock or
TCP/IP to the TDX guest, since that would let the guest communicate with a host
service (the Quoting service in our example) directly.  The host itself,
however, will almost certainly support at least TCP/IP; otherwise the machine
would be totally isolated.

>     cases, TDX defines a GetQuote hypercall which can be used by the guest
>     to request the host VMM to communicate with the SGX QE. More details
>     about GetQuote hypercall can be found in TDX Guest-Host Communication
>     Interface (GHCI) for Intel TDX 1.0, section titled
>     "TDG.VP.VMCALL<GetQuote>".
>     
>     Add support for TDX_CMD_GET_QUOTE IOCTL to allow an attestation agent
>     to submit GetQuote requests from the user space using GetQuote
>     hypercall.
>     
>     Since GetQuote is an asynchronous request hypercall, VMM will use the
>     callback interrupt vector configured by the SetupEventNotifyInterrupt
>     hypercall to notify the guest about Quote generation completion or
>     failure. So register an IRQ handler for it.
>     
>     GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
>     with TDREPORT data as input, which is further used by the VMM to copy
>     the TD Quote result after successful Quote generation. Allocate the
>     required shared memory once in tdx_guest_init() and reuse it in the

"required" isn't very clear IMHO, because there's no mention of any requirement
at all regarding the buffer size.  If you don't want to explain it in the
changelog, perhaps you need to at least mention something like "allocate a
large enough shared memory" etc.

>     TDX_CMD_GET_QUOTE IOCTL handler for GetQuote requests.
>     
>     Although this method reserves a fixed chunk of memory for GetQuote
>     requests, such one-time allocation is preferable to the alternative
>     choice of repeatedly allocating/freeing the shared buffer in the
>     TDX_CMD_GET_QUOTE IOCTL handler, which will damage the direct map
>     (because the sharing/unsharing process modifies the direct map). This
>     allocation model is similar to that used by the AMD SEV guest driver.
>     
>     Since the Quote generation process is not time-critical or frequently
>     used, the current version do not support parallel GetQuote requests.
				^
				does not

[...]

> > >  
> > > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> > > index 26f6e2eaf5c8..09b5925eec67 100644
> > > --- a/arch/x86/coco/tdx/tdx.c
> > > +++ b/arch/x86/coco/tdx/tdx.c
> > > @@ -33,6 +33,7 @@
> > >  #define TDVMCALL_MAP_GPA		0x10001
> > >  #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
> > >  #define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
> > > +#define TDVMCALL_GET_QUOTE		0x10002
> > >  
> > >  /* MMIO direction */
> > >  #define EPT_READ	0
> > > @@ -198,6 +199,45 @@ static void __noreturn tdx_panic(const char *msg)
> > >  		__tdx_hypercall(&args, 0);
> > >  }
> > >  
> > > +/**
> > > + * tdx_hcall_get_quote() - Wrapper to request TD Quote using GetQuote
> > > + *                         hypercall.
> > > + * @tdquote: Address of the direct mapped shared kernel buffer which
> > > + * 	     contains TDREPORT data. The same buffer will be used by
> > > + * 	     VMM to store the generated TD Quote output.
> > > + * @size: size of the tdquote buffer.

Better to call out that @size must be a multiple of 4K.

> > > + *
> > > + * Refer to section titled "TDG.VP.VMCALL<GetQuote>" in the TDX GHCI
> > > + * v1.0 specification for more information on GetQuote hypercall.
> > > + * It is used in the TDX guest driver module to get the TD Quote.
> > > + *
> > > + * Return 0 on success or error code on failure.
> > > + */
> > > +int tdx_hcall_get_quote(u8 *tdquote, size_t size)
> > > +{
> > > +	struct tdx_hypercall_args args = {0};
> > > +
> > > +	/*
> > > +	 * TDX guest driver is the only user of this function and it uses
> > > +	 * the kernel mapped memory. So use virt_to_phys() to get the
> > > +	 * physical address of the TDQuote buffer without any additional
> > > +	 * checks for memory type.
> > > +	 */
> > 
> > How about just call out this function requires the buffer must be shared in kdoc
> > style comment above this function?  We should just focus on what this function
> > is doing.
> 
> It was suggested in the previous review to add it. A comment about why using
> virt_to_phys() is fine here. 

Sorry, I tried to dig but couldn't find the comment you mentioned.

Anyway, IMHO you can just remove this comment, because after reading it again,
I think it only confuses people:

1) "memory type" is confusing.  I think I (as someone who is working on TDX too)
understand by "memory type" you actually mean the virtual address is from kernel
direct mapping, or it could be from vmalloc()/vmap() (do you?), but for others
it could be cache related memory type (UC, WB, etc).

2) You already said the buffer is "direct mapped shared kernel buffer" (btw,
should be "directly mapped" really) in the comment which explains @tdquote, thus
it's clear how to get the physical address.  "TDX guest driver is the only user
of this function" doesn't matter -- when there's another user in the future and
if it changes where @tdquote comes from, then the comment of @tdquote and
the code need to be updated anyway.

So I don't see any value in this comment.

Btw, to me @tdquote isn't a good name because it gives me the impression that
the Quote is used as input.  How about just 'buf' (which you also used in
'tdx_quote_req' structure)?

> 
> > 
> > > +	args.r10 = TDX_HYPERCALL_STANDARD;
> > > +	args.r11 = TDVMCALL_GET_QUOTE;
> > > +	args.r12 = cc_mkdec(virt_to_phys(tdquote));

Btw can we just use __pa()?  To be honest I am ignorant on the difference
between virt_to_phys() and __pa(), i.e. when should we use which.

Also, you _may_ want to add a comment why "cc_mkdec()" is used.  By the nature
of this TDVMCALL, it's obvious the buffer needs to be shared, and the VMM must
check whether the buffer is actually shared, no matter whether the "shared-bit"
is set here or not.

So to me it's just the GHCI spec that requires us to include the
"shared-bit", but it _seems_ the GHCI spec doesn't explicitly say we need to do
that because it only says "Shared buffer as input".  So it looks like a comment
can help to clarify a little bit.
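
To make concrete what such a comment would describe, below is a minimal
userspace model (not kernel code) of tagging a guest physical address as
shared.  The bit position (51) and the names model_mkdec() and
model_gpa_is_shared() are assumptions for illustration only; the real shared
bit depends on the guest's GPA width, and in the kernel cc_mkdec() applies it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: the shared bit is bit 51.  The real position depends on
 * the GPA width reported to the guest.
 */
#define MODEL_SHARED_BIT	51
#define MODEL_SHARED_MASK	(UINT64_C(1) << MODEL_SHARED_BIT)

static_assert(MODEL_SHARED_BIT < 64, "shared bit must fit in a 64-bit GPA");

/* Hypothetical stand-in for cc_mkdec(): set the shared bit in a GPA. */
uint64_t model_mkdec(uint64_t gpa)
{
	return gpa | MODEL_SHARED_MASK;
}

/* Hypothetical check of whether a GPA carries the shared bit. */
int model_gpa_is_shared(uint64_t gpa)
{
	return (gpa & MODEL_SHARED_MASK) != 0;
}
```

The point of the sketch is only that the address handed to the VMM in R12
differs from the plain physical address by one bit, which is what a short
comment next to cc_mkdec() could state.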

> > > +	args.r13 = size;

The @size must be 4K-aligned per GHCI.  I guess we should add a check and return
early rather than depending on the TDVMCALL to fail?
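
As a rough illustration of the early check suggested above, here is a
standalone userspace sketch.  The helper name model_check_quote_size() and the
choice to also reject a zero size are assumptions, not something the patch or
the GHCI text quoted here spells out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* 4K granularity, as required for the GetQuote buffer size. */
#define MODEL_QUOTE_ALIGN	4096u

static_assert((MODEL_QUOTE_ALIGN & (MODEL_QUOTE_ALIGN - 1)) == 0,
	      "alignment must be a power of two");

/*
 * Hypothetical helper: return 0 if @size is a non-zero multiple of 4K,
 * -EINVAL otherwise, so the caller can bail out before the hypercall.
 */
int model_check_quote_size(size_t size)
{
	if (size == 0 || (size & (MODEL_QUOTE_ALIGN - 1)))
		return -EINVAL;

	return 0;
}
```

In the kernel this would presumably be written with IS_ALIGNED() rather than
open-coded, and placed before the hypercall arguments are filled in.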

> > > +
> > > +	/*
> > > +	 * Pass the physical address of TDREPORT to the VMM and
> > > +	 * trigger the Quote generation. It is not a blocking
> > > +	 * call, hence completion of this request will be notified to
> > > +	 * the TD guest via a callback interrupt.
> > > +	 */
> > > +	return __tdx_hypercall(&args, 0);
> > > +}
> > > +EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
> > > +
> > > 
> 

[...]

> > 
> > > +
> > > +/**
> > > + * struct quote_entry - Quote request struct
> > > + * @valid: Flag to check validity of the GetQuote request.
> > > + * @buf: Kernel buffer to share data with VMM (size is page aligned).
> > > + * @buf_len: Size of the buf in bytes.
> > > + * @compl: Completion object to track completion of GetQuote request.
> > > + */
> > > +struct quote_entry {
> > > +	bool valid;
> > > +	void *buf;
> > > +	size_t buf_len;
> > > +	struct completion compl;
> > > +};
> > 
> > We have a static global @qentry below.
> > 
> > The buffer size is a fixed size (16K), why do we need @buf_len here?
> 
> I have added it to support buf length changes in future (like adding a
> command line option to allow user change the GET_QUOTE_MAX_SIZE).  Also,
> IMO, using buf_len is more readable than just using GET_QUOTE_MAX_SIZE
> macro in all places.
> 
> > 
> > And why do we need @valid?  It seems ...
> 
> As a precaution against spurious event notification. I also believe that in
> the future, event notification IRQs may be used for other purposes such as
> vTPM or other TDVMCALL services, and that this handler may be triggered
> without a valid GetQuote request. So, before we process the IRQ, I want to
> make sure we have a valid buffer.

OK.  This is basically a shared IRQ, so we need to track whether we have any
GetQuote request pending.

However I am wondering whether we need a dedicated @valid for this purpose.  If
I read correctly, we will make sure the buffer is zero-ed when there's no
request pending, thus can we just use some member in 'tdx_quote_hdr' to track?

For instance, per the GHCI the 'version' must be set to 1 for a valid request.
And I think for the foreseeable future we can also assume @in_len is the size
of TDREPORT_STRUCT.  Can we use one of them (i.e. version) for this purpose?
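
A minimal userspace model of that idea, assuming the buffer is zeroed whenever
no request is pending.  The struct and function names are placeholders, and
the status test simply mirrors the one in the patch's quote_cb_handler().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_QUOTE_VERSION	1
#define MODEL_IN_FLIGHT		UINT64_C(0xffffffffffffffff)

/* Mirrors the fixed fields of the request buffer header in the patch. */
struct model_quote_buf {
	uint64_t version;
	uint64_t status;
	uint32_t in_len;
	uint32_t out_len;
};

/* A zeroed buffer has version 0, so it reads as "no request pending". */
bool model_request_pending(const struct model_quote_buf *buf)
{
	return buf->version == MODEL_QUOTE_VERSION;
}

/* What the IRQ callback would test before calling complete(). */
bool model_should_complete(const struct model_quote_buf *buf)
{
	return model_request_pending(buf) && buf->status != MODEL_IN_FLIGHT;
}
```

With this, a notification that arrives while the buffer is still zeroed is
ignored without needing a separate @valid flag.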

> 
> > 
> > > +
> > > +/* Quote data entry */
> > > +static struct quote_entry *qentry;
> > > +
> > > +/* Lock to streamline quote requests */
> > > +static DEFINE_MUTEX(quote_lock);
> > > +
> > > +static int quote_cb_handler(void *dev_id)
> > > +{
> > > +	struct quote_entry *entry = dev_id;
> > > +	struct tdx_quote_hdr *quote_hdr = entry->buf;
> > > +
> > > +	if (entry->valid && quote_hdr->status != GET_QUOTE_IN_FLIGHT)
> > > +		complete(&entry->compl);
> > 
> > ... this handler is only called when we have received the notification from the
> > VMM, so the VMM must have already put something into the buffer, meaning the
> > buffer is already valid.
> > 
> > Could you explain why do we need @valid?
> > 
> > That being said, to me I found the 'struct quote_entry' itself is quite
> > unnecessary.  It looks like a leftover that you didn't remove when changing from
> > supporting multiple GetQuote requests in parallel to only supporting one request
> > at one time.
> 
> I don't want to use multiple global values. So I have bundled all Quote related
> book keeping (completion object or buffer pointer) in the same struct.

IMHO both @buf_len and @valid are not necessary, so you just need a static
buffer plus a static completion.  And you already have a static @quote_lock.

But I have no very strong opinion here.

[...]

> > > +
> > > +	/* Submit GetQuote Request using GetQuote hypercall */
> > > +	ret = tdx_hcall_get_quote(qentry->buf, qentry->buf_len);
> > > +	if (ret) {
> > > +		pr_err("GetQuote hypercall failed, status:%lx\n", ret);
> > > +		ret = -EIO;
> > > +		goto quote_failed;
> > > +	}
> > > +
> > > +	/* Wait till GetQuote completion */
> > > +	wait_for_completion(&qentry->compl);
> > 
> > Non-killable wait w/o timeout worries me a little bit, because it can wait
> > forever if VMM also couldn't get the Quote for whatever reason  and doesn't have
> > it's own timeout.  Unfortunately the GHCI doesn't put any requirement to the VMM
> > on this, so we kinda depend on the VMM.
> > 
> > Perhaps for now it's OK to have this simple implementation, but looks we should
> > at least call out the risk in the comment.
> 
> How about the following comment?
> 
> /*
>  * Since TDX GHCI specification does not define a valid timeout for GetQuote
>  * requests, wait until VMM sends the completion notification. Note that there
>  * is a risk that this wait can be infinite.
>  */

This comment is missing the point I think.

There are two things actually:

1) The timeout from VMM

The GHCI should explicitly put some requirement on the VMM, i.e. say something
like "the VMM must not wait for the Quote infinitely but must signal the TDX
guest after a certain time, which can be implementation specific".  In this
case, we can be sure that the guest won't wait forever.

2) The timeout support in the GetQuote itself

This allows the guest to specify a timeout in the GetQuote TDVMCALL, so guest
can have its own control on how long to wait.

AFAICT for now we don't have a requirement for 2).  What we truly want is 1),
because we certainly don't want to wait forever because of some careless VMM
implementation.

So, how about:

	/*
	 * Although the GHCI doesn't specifically put a hard requirement on the
	 * VMM that it must not wait for the Quote infinitely, a sane VMM should
	 * always notify the guest after a certain time no matter whether
	 * getting the Quote is successful or not.  For now just depend on the
	 * VMM to do so.
	 */

[...]

> 
> > >  
> > >  static int __init tdx_guest_init(void)
> > >  {
> > > +	int ret;
> > > +
> > >  	if (!x86_match_cpu(tdx_guest_ids))
> > >  		return -ENODEV;
> > >  
> > > -	return misc_register(&tdx_misc_dev);
> > > +	ret = misc_register(&tdx_misc_dev);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	qentry = alloc_quote_entry(GET_QUOTE_MAX_SIZE);
> > > +	if (!qentry) {
> > > +		pr_err("Quote entry allocation failed\n");
> > 
> > This is a rather confusing message from user's perspective.  The result of this
> > error isn't clear from  this message.  I think we should have clear message ...
> 
> I will remove it.

I think it's fine to keep it, but my point is, besides above error msg, it's
better to explicitly print "attestation is not available", which is the result
of above error, to the user.

Whether the above message can be improved is another story.  For instance, I
believe "Failed to allocate Quote buffer" is better.

> 
> > 
> > > +		ret = -ENOMEM;
> > > +		goto free_misc;
> > > +	}
> > > +
> > > +	ret = tdx_register_event_irq_cb(quote_cb_handler, qentry);
> > > +	if (ret)
> > > +		goto free_quote;
> > > +
> > > +	return 0;
> > > +
> > > +free_quote:
> > > +	free_quote_entry(qentry);
> > > +free_misc:
> > > +	misc_deregister(&tdx_misc_dev);
> > 
> > ... here saying something like "Attestation is not available" so user can be
> > clear about this.
> > 
> > > +
> > > +	return ret;
> > >  }
> > >  module_init(tdx_guest_init);
> > >  
> > >  static void __exit tdx_guest_exit(void)
> > >  {
> > > +	tdx_unregister_event_irq_cb(quote_cb_handler, qentry);
> > > +	free_quote_entry(qentry);
> > >  	misc_deregister(&tdx_misc_dev);
> > >  }
> > >  module_exit(tdx_guest_exit);
> > > diff --git a/include/uapi/linux/tdx-guest.h b/include/uapi/linux/tdx-guest.h
> > > index a6a2098c08ff..500cdfa025ad 100644
> > > --- a/include/uapi/linux/tdx-guest.h
> > > +++ b/include/uapi/linux/tdx-guest.h
> > > @@ -17,6 +17,12 @@
> > >  /* Length of TDREPORT used in TDG.MR.REPORT TDCALL */
> > >  #define TDX_REPORT_LEN                  1024
> > >  
> > > +/* TD Quote status codes */
> > > +#define GET_QUOTE_SUCCESS               0
> > > +#define GET_QUOTE_IN_FLIGHT             0xffffffffffffffff
> > > +#define GET_QUOTE_ERROR                 0x8000000000000000
> > > +#define GET_QUOTE_SERVICE_UNAVAILABLE   0x8000000000000001
> > > +
> > >  /**
> > >   * struct tdx_report_req - Request struct for TDX_CMD_GET_REPORT0 IOCTL.
> > >   *
> > > @@ -30,6 +36,35 @@ struct tdx_report_req {
> > >  	__u8 tdreport[TDX_REPORT_LEN];
> > >  };
> > >  
> > > +/* struct tdx_quote_hdr: Format of Quote request buffer header.
> > > + * @version: Quote format version, filled by TD.
> > > + * @status: Status code of Quote request, filled by VMM.
> > > + * @in_len: Length of TDREPORT, filled by TD.
> > > + * @out_len: Length of Quote data, filled by VMM.
> > > + * @data: Quote data on output or TDREPORT on input.
> > > + *
> > > + * More details of Quote data header can be found in TDX
> > > + * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
> > > + * section titled "TDG.VP.VMCALL<GetQuote>"
> > > + */
> > > +struct tdx_quote_hdr {
> > > +	__u64 version;
> > > +	__u64 status;
> > > +	__u32 in_len;
> > > +	__u32 out_len;
> > > +	__u64 data[];
> > > +};
> > 
> > This structure is weird.  It's a header, but it contains the dynamic-size
> > buffer.  If you have __data[] in this structure, then it is already a buffer for
> > the entire Quote, no?  Then should we just call it 'struct tdx_quote'?
> > 
> > Or do you want to remove __data[]?
> 
> I can name it as struct tdx_quote_data

If you go with this route, why not just 'tdx_quote', or 'tdx_tdquote'?

Or, actually I think 'tdx_quote' (or 'tdx_tdquote') seems to be the format of
the _true_ Quote, so perhaps we want 'struct tdx_quote_req_buf'?

> 
> > 
> > > +
> > > +/* struct tdx_quote_req: Request struct for TDX_CMD_GET_QUOTE IOCTL.
> > > + * @buf: Address of user buffer that includes TDREPORT. Upon successful
> > > + *	 completion of IOCTL, output is copied back to the same buffer.
> > 
> > This description isn't precise.  "user buffer that includes TDREPORT" doesn't
> > tell application writer where to put the TDREPORT at all.  We need to explicitly
> > call out the buffer starts with 'tdx_quote_hdr' followed by TDREPORT
> > immediately.
> 
> I have specified it in struct tdx_quote_hdr.data help content.

Perhaps I missed something but I didn't see any place where this is clearly
documented.  The comment around @data above certainly doesn't.

Just say something like:

	@buf: The userspace pointer which points to the
	      'struct tdx_quote_req_buf' (whatever the final name)
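
To show the layout an application writer needs to know about, here is a small
userspace sketch that builds such a buffer: the header fields first, then the
TDREPORT immediately after them.  The struct name model_quote_req_buf and the
helper model_build_quote_req() are placeholders for illustration; only the
field layout and the 1024-byte report length come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_REPORT_LEN	1024	/* TDX_REPORT_LEN in the patch */

/* Same field layout as the header in the patch; the name is a placeholder. */
struct model_quote_req_buf {
	uint64_t version;
	uint64_t status;
	uint32_t in_len;
	uint32_t out_len;
	uint64_t data[];
};

static_assert(offsetof(struct model_quote_req_buf, data) == 24,
	      "TDREPORT must start right after the 24-byte header");

/*
 * Allocate a zeroed buffer of @buf_len bytes and place the TDREPORT
 * immediately after the header.  Returns NULL if @buf_len is too small
 * or the allocation fails; the caller frees the result.
 */
struct model_quote_req_buf *model_build_quote_req(const uint8_t *report,
						   size_t buf_len)
{
	struct model_quote_req_buf *buf;

	if (buf_len < sizeof(*buf) + MODEL_REPORT_LEN)
		return NULL;

	buf = calloc(1, buf_len);
	if (!buf)
		return NULL;

	buf->version = 1;		/* filled by the TD */
	buf->in_len = MODEL_REPORT_LEN;	/* filled by the TD */
	memcpy(buf->data, report, MODEL_REPORT_LEN);

	return buf;
}
```

Userspace would then put the address and length of this buffer into the
'buf' and 'len' members of 'struct tdx_quote_req' before issuing the IOCTL.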

> 
> >  
> > > + * @len: Length of the Quote buffer.
> > > + */
> > > +struct tdx_quote_req {
> > > +	__u64 buf;
> > > +	__u64 len;
> > > +};
> > > +
> > >  /*