Re: [RFC PATCH 11/21] KVM: SEV: Add TIO VMGEXIT and bind TDI

Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx> · Wed, 18 Sep 2024 18:45:14 +0800

On Sat, Sep 14, 2024 at 08:19:46AM +0300, Zhi Wang wrote:
> On Sat, 14 Sep 2024 02:47:27 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> 
> > > From: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > Sent: Saturday, September 14, 2024 6:09 AM
> > > 
> > > Zhi Wang wrote:
> > > > On Fri, 23 Aug 2024 23:21:25 +1000
> > > > Alexey Kardashevskiy <aik@xxxxxxx> wrote:
> > > >
> > > > > The SEV TIO spec defines a new TIO_GUEST_MESSAGE message to
> > > > > provide a secure communication channel between a SNP VM and
> > > > > the PSP.
> > > > >
> > > > > The defined messages provide a way to read TDI info and do
> > > > > secure MMIO/DMA setup.
> > > > >
> > > > > On top of this, GHCB defines an extension to return
> > > > > certificates/measurements/report and TDI run status to the VM.
> > > > >
> > > > > The TIO_GUEST_MESSAGE handler also checks if a specific TDI is
> > > > > bound to the VM and exits KVM to allow userspace to bind it.
> > > > >
> > > >
> > > > Out of curiosity, do we have to handle the TDI bind/unbind in
> > > > kernel space? It seems we are making the relationship between
> > > > modules more complicated. What is the design concern with letting
> > > > QEMU handle the TDI bind/unbind message, since QEMU can talk to
> > > > VFIO/KVM and also TSM?
> > > 
> > > Hmm, the flow I have in mind is:
> > > 
> > > Guest GHCx(BIND) => KVM => TSM GHCx handler => VFIO state update +
> > > TSM low-level BIND
> > > 
> > > vs this: (if I understand your question correctly?)
> > > 
> > > Guest GHCx(BIND) => KVM => TSM GHCx handler => QEMU => VFIO => TSM
> > > low-level BIND
> > 
> > Reading this patch, it appears to be implemented this way, except
> > that QEMU calls a KVM_DEV uAPI instead of going through VFIO (as
> > Yilun suggested).
> > 
> > > 
> > > Why exit to QEMU only to turn around and call back into the kernel?
> > > VFIO should already have the context from establishing the vPCI
> > > device as "bind-capable" at setup time.

Previously we tried to do the host-side "bind-capable" setup (creating
the TDI context required by firmware, but without LOCK) at setup time.
But we didn't see enough value in it; it only made the error recovery
flow more complex. So now I don't see much work to do for "bind-capable"
beyond marking the device as can-be-private. I.e. the context setup that
used to happen when establishing the vPCI device is moved to the GHCx
BIND phase.

> > > 
> > 
> > The general practice in VFIO is to design things around userspace
> > driver control over the device w/o assuming the existence of KVM.
> > When VMM comes to the picture the interaction with KVM is minimized
> > unless for functional or perf reasons.
> > 
> > e.g. KVM needs to know whether an assigned device allows non-coherent
> > DMA for proper cache control, or mdev/new vIOMMU object needs
> > a reference to struct kvm, etc. 
> > 
> > Sometimes frequent trap-and-emulate is too costly, so KVM/VFIO may
> > enable in-kernel acceleration to skip Qemu via eventfd, but in
> > this case the slow path via Qemu was implemented first.
> > 
> > Ideally BIND/UNBIND is not a frequent operation, so falling back to
> > Qemu in a longer path is not a real problem. If there is no specific
> > functionality or security reason for doing it in-kernel, I'm inclined
> > to agree with Zhi here (though not about complexity).

I agree with GHCx BIND/UNBIND being routed to QEMU, because BIND/UNBIND
needs host-side cross-module management, e.g. IOMMUFD page table
switching and the VFIO-side settings that build the host-side TDI
context and LOCK the TDI.

But I do support routing the other GHCx calls made between BIND and
UNBIND directly to TSM low-level, e.g. get device interface report, get
device certificates/measurements, TDISP RUN. These communications are
purely between the CoCo-VM, the firmware and the TDI. The host has no
business in them, so there is no value in passing these requests to
QEMU/VFIO only to come back into TSM low-level.
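
To make the split concrete, here is a rough sketch of the routing I have
in mind. All names below (tio_req, tio_route, tio_route_request) are made
up for illustration; they are not from the patch or from any existing
kernel API, and the real request set comes from the GHCB/TIO spec.

```c
/* Hypothetical guest request types, for illustration only. */
enum tio_req {
	TIO_REQ_BIND,
	TIO_REQ_UNBIND,
	TIO_REQ_GET_REPORT,	/* device interface report */
	TIO_REQ_GET_CERTS,	/* certificates/measurements */
	TIO_REQ_TDISP_RUN,
};

/* Where a request gets handled (hypothetical). */
enum tio_route {
	ROUTE_EXIT_TO_USERSPACE,	/* QEMU drives IOMMUFD/VFIO, then TSM */
	ROUTE_TSM_LOW_LEVEL,		/* handled in-kernel, host not involved */
};

/*
 * BIND/UNBIND need host-side cross-module work, so they exit to
 * userspace. Everything else is between the guest, the firmware and
 * the TDI, so it goes straight to the TSM low-level driver.
 */
static enum tio_route tio_route_request(enum tio_req req)
{
	switch (req) {
	case TIO_REQ_BIND:
	case TIO_REQ_UNBIND:
		return ROUTE_EXIT_TO_USERSPACE;
	default:
		return ROUTE_TSM_LOW_LEVEL;
	}
}
```

The point is only that the decision depends on the request type, not on
which module happens to receive the VMGEXIT first.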

Thanks,
Yilun

> > 
> > 
> 
> Exactly what I was thinking. Folks have been spending quite some effort
> on keeping VFIO and KVM independent. The existing shortcut calls
> between the two modules are there because there is no better way to do
> it.
> 
> TSM BIND/UNBIND should not be a performance critical path. Thus falling
> back to QEMU would be fine. Besides, not sure about others' opinion, I
> don't think adding tsm_{bind, unbind} in kvm_x86_ops is a good idea.
> 
> If we have to stick to the current approach, I think we need more
> justifications.
>