On Fri, Nov 19, 2021 at 10:21:39PM +0000, Sean Christopherson wrote:
> On Fri, Nov 19, 2021, Jason Gunthorpe wrote:
> > On Fri, Nov 19, 2021 at 07:18:00PM +0000, Sean Christopherson wrote:
> > > No ideas for the kernel API, but that's also less concerning since
> > > it's not set in stone. I'm also not sure that dedicated APIs for
> > > each high-ish level use case would be a bad thing, as the semantics
> > > are unlikely to be different to some extent. E.g. for the KVM use
> > > case, there can be at most one guest associated with the fd, but
> > > there can be any number of VFIO devices attached to the fd.
> >
> > Even the kvm thing is not a hard restriction when you take away
> > confidential compute.
> >
> > Why can't we have multiple KVMs linked to the same FD if the memory
> > isn't encrypted? Sure it isn't actually useful but it should work
> > fine.
>
> Hmm, true, but I want the KVM semantics to be 1:1 even if memory
> isn't encrypted.

That is policy and it doesn't belong hardwired into the kernel.

Your explanation makes me think that the F_SEAL_XX isn't defined
properly. It should be a userspace trap door to prevent any new
external accesses, including establishing new kvms, iommu's, rdmas,
mmaps, read/write, etc.

> It's not just avoiding the linked list, there's a trust element as
> well. E.g. in the scenario where a device can access a confidential
> VM's encrypted private memory, the guest is still the "owner" of the
> memory and needs to explicitly grant access to a third party,
> e.g. the device or perhaps another VM.

Authorization is some other issue - the internal kAPI should be able
to indicate it is secured memory and the API user should do whatever
dance to gain access to it. E.g. for VFIO ask the realm manager to
associate the pci_device with the owner realm.

Jason