Re: [PATCH v14 21/22] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands

On Fri, Apr 26, 2024 at 12:57:08PM -0700, Sean Christopherson wrote:
> On Fri, Apr 26, 2024, Michael Roth wrote:
> > On Wed, Apr 24, 2024 at 05:15:40PM -0700, Sean Christopherson wrote:
> > > On Sun, Apr 21, 2024, Michael Roth wrote:
> > > > These commands can be used to pause servicing of guest attestation
> > > > requests. This is useful when updating the reported TCB or signing key with
> > > > commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
> > > > in turn require updates to userspace-supplied certificates, and if an
> > > > attestation request happens to be in-flight at the time those updates
> > > > are occurring there is potential for a guest to receive a certificate
> > > > blob that is out of sync with the effective signing key for the
> > > > attestation report.
> > > > 
> > > > These interfaces also provide some versatility with how similar
> > > > firmware/certificate update activities can be handled in the future.
> > > 
> > > Wait, IIUC, this is using the kernel to get two userspace components to not
> > > stomp over each other.   Why is this the kernel's problem to solve?
> > 
> > It's not that they are stepping on each other, but that kernel and
> > userspace need to coordinate on updating 2 components whose updates need
> > to be atomic from a guest perspective. Take an update to VLEK key for
> > instance:
> > 
> >  1) management gets a new VLEK endorsement key from KDS along with
> 
> What is "management"?  I assume its some userspace daemon?

It could be a daemon, depending on the cloud provider, but the main
example we have in mind is something more basic, like virtee[1] being
used to perform an update interactively at the command line. E.g. you
point it at the new VLEK and the new cert, and it handles updating the
certs at some known location and issuing the SNP_VLEK_LOAD command. With
this interface, it can take the additional step of PAUSE'ing
attestations before performing either update, so that the 2 actions stay
in sync from the guest's point of view.

[1] https://github.com/virtee/snphost
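
As a rough illustration of the ordering problem, here is a toy Python
model. It is only a sketch: the Host class, update() and the string
values are made up for this example and are not kernel or snphost
interfaces. A paused request is modelled as returning None; how a paused
request is actually reported back to the guest is not covered here.

```python
# Toy model only: none of these names are kernel or snphost APIs.
class Host:
    def __init__(self):
        self.key = "vlek-old"      # key the firmware signs reports with
        self.certs = "certs-old"   # cert blob the VMM hands to the guest
        self.paused = False

    def attest(self):
        """Return (signing key, certs) as a guest would see them, or None if paused."""
        if self.paused:
            return None            # request is held off while paused
        return (self.key, self.certs)


def update(host, use_pause, probe):
    """Update the key, then the certs, calling probe() between the two steps."""
    if use_pause:
        host.paused = True         # stands in for SNP_PAUSE_ATTESTATION
    host.key = "vlek-new"          # stands in for SNP_VLEK_LOAD
    seen = probe()                 # attestation request arriving mid-update
    host.certs = "certs-new"       # management rewrites the cert file
    if use_pause:
        host.paused = False        # stands in for SNP_RESUME_ATTESTATION
    return seen


racy = Host()
mid_racy = update(racy, False, racy.attest)   # new key paired with old certs

safe = Host()
mid_safe = update(safe, True, safe.attest)    # nothing served mid-update
```

Without the pause, the request that lands between the two steps sees a
mismatched key/cert pair. With it, nothing is served until both updates
are done, and the next request sees the new key together with the new
certs.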

> 
> >     associated certificate chain
> >  2) management uses SNP_VLEK_LOAD to update key
> >  3) management updates the certs at the path VMM will grab them
> >     from when the EXT_GUEST_REQUEST userspace exit is issued
> > 
> > If an attestation request comes in after 2), but before 3), then the
> > guest sees an attestation report signed with the new key, but still
> > gets the old certificate.
> > 
> > If you reverse the ordering:
> > 
> >  1) management gets a new VLEK endorsement key from KDS along with
> >     associated certificate chain
> >  2) management updates the certs at the path VMM will grab them
> >     from when the EXT_GUEST_REQUEST userspace exit is issued
> >  3) management uses SNP_VLEK_LOAD to update key
> > 
> > then an attestation request between 2) and 3) will result in the guest
> > getting the new cert, but getting an attestation report signed with an old
> > endorsement key.
> > 
> > Providing a way to pause guest attestation requests prior to 2), and
> > resume after 3), provides a straightforward way to make those updates
> > atomic to the guest.
> 
> Assuming "management" is a userspace component, I still don't see why this
> requires kernel involvement.  "management" can tell VMMs to pause attestation
> without having to bounce through the kernel.  It doesn't even require a push

That would mean a tool like virtee above would need to issue kernel
commands like SNP_VLEK_LOAD to handle the key update, then implement some
VMM-specific hook to pause servicing of EXT_GUEST_REQ (or whatever we
end up calling it). QEMU could define events for this, libvirt could
implement them, and virtee could interact with libvirt to issue them in
place of the PAUSE/RESUME approach here.

But SNP libvirt support is a ways out, and a QEMU event mechanism for
this would be a pain to use directly because you'd need some custom way
to enumerate all guests in order to issue the events. And then maybe the
provider doesn't even use QEMU and has to invent something else. Or they
just decide to pause all guests before performing updates, but that's
still a potentially significant amount of downtime.

> without having to bounce through the kernel.  It doesn't even require a push
> model, e.g. wrap/redirect the certs with a file that has a "pause" flag and a
> sequence counter.

We could do something like flag the certificate file itself; it does
sound less painful than the above. But what defines that spec? GHCB
completely defines the current format of the certs blob, so if we wrap
that in another layer we need to either extend the GHCB or have something
else be the authority on what that wrapper looks like. Tools like virtee
would then need to be very selective about which VMMs they can claim to
support, based on what file format each VMM supports. It just seems like
a significant and unnecessary pain that every userspace implementation
would need to go through to achieve the same basic functionality.
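
To make that concrete, below is a minimal Python sketch of the reader
side every VMM would have to carry for a "pause flag plus sequence
counter" wrapper. The CertWrapper layout, the field names and
read_certs() are all hypothetical; nothing in the GHCB or anywhere else
defines them, which is exactly the problem.

```python
# Hypothetical wrapper layout; no spec (GHCB or otherwise) defines it.
from dataclasses import dataclass


@dataclass
class CertWrapper:
    seq: int        # bumped by management on every completed update
    paused: bool    # set while the key and certs are being swapped
    certs: bytes    # GHCB-format certificate blob, passed through unchanged


def read_certs(load, max_tries=3):
    """Return certs from a stable snapshot, or None if the caller should retry later.

    load() is an assumed callback that re-reads and parses the wrapper file.
    """
    for _ in range(max_tries):
        before = load()
        if before.paused:
            continue                       # update in progress, look again
        after = load()
        if not after.paused and after.seq == before.seq:
            return before.certs            # nothing changed between the reads
    return None
```

Each VMM and each management tool would have to agree on this layout and
on the retry rules before the scheme could work across implementations.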

With PAUSE/RESUME, tools like virtee can be completely VMM-agnostic, and
more highly-integrated daemon-based approaches can still benefit from a
common mechanism that doesn't require significant coordination with VMM
processes. For something as important and basic as updating endorsement
keys while guests are running, it seems worthwhile to expose this minimal
level of control to userspace.

-Mike
