On Fri, Mar 19, 2021 at 08:22:21PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> Add a misc device /dev/sgx_vepc to allow userspace to allocate "raw" EPC
> without an associated enclave. The intended and only known use case for
> raw EPC allocation is to expose EPC to a KVM guest, hence the 'vepc'
> moniker, virt.{c,h} files and X86_SGX_KVM Kconfig.
>
> SGX driver uses misc device /dev/sgx_enclave to support userspace to
> create enclave. Each file descriptor from opening /dev/sgx_enclave
> represents an enclave. Unlike SGX driver, KVM doesn't control how guest
> uses EPC, therefore EPC allocated to KVM guest is not associated to an
> enclave, and /dev/sgx_enclave is not suitable for allocating EPC for KVM
> guest.
>
> Having separate device nodes for SGX driver and KVM virtual EPC also
> allows separate permission control for running host SGX enclaves and
> KVM SGX guests.

Hmm, just a question on the big picture here - that might've popped up
already:

So baremetal uses /dev/sgx_enclave and KVM uses /dev/sgx_vepc. Who's
deciding which of the two has priority?

Let's say all guests start using enclaves and baremetal cannot start any
new ones anymore due to no more memory. Are we ok with that?

What if baremetal creates a big fat enclave and starves guests all of a
sudden? Are we ok with that either?

In general, having two disjoint things give out SGX resources separately
sounds like trouble to me.

IOW, why don't all virt allocations go through /dev/sgx_enclave too, so
that you can have a single place to control all resource allocations?

> To use /dev/sgx_vepc to allocate a virtual EPC instance with particular
> size, the userspace hypervisor opens /dev/sgx_vepc, and uses mmap()
> with the intended size to get an address range of virtual EPC. Then
> it may use the address range to create one KVM memory slot as virtual
> EPC for guest.
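
For reference, the open() + mmap() flow described in that paragraph could
look roughly like the sketch below from the userspace side. This is only an
illustration under assumptions: struct vepc_region, vepc_map() and
vepc_unmap() are made-up names, and the protection and mapping flags
(PROT_READ | PROT_WRITE, MAP_SHARED) are a guess at what a hypervisor would
pass, not something the patch description specifies.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one virtual EPC instance. */
struct vepc_region {
    int fd;       /* open handle on the device node */
    void *addr;   /* start of the mmap()ed address range */
    size_t size;  /* length of the range in bytes */
};

/*
 * Open the device node at `path` (e.g. "/dev/sgx_vepc") and mmap() `size`
 * bytes from it. Returns 0 on success, -1 on failure.
 * The mmap() flags are an assumption, see the note above.
 */
int vepc_map(const char *path, size_t size, struct vepc_region *r)
{
    r->fd = open(path, O_RDWR);
    if (r->fd < 0)
        return -1;

    r->addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, r->fd, 0);
    if (r->addr == MAP_FAILED) {
        close(r->fd);
        return -1;
    }

    r->size = size;
    return 0;
}

/* Drop the mapping and the file descriptor again. */
void vepc_unmap(struct vepc_region *r)
{
    munmap(r->addr, r->size);
    close(r->fd);
}
```

A hypervisor would presumably then hand r.addr and r.size to KVM as the
userspace_addr and memory_size of a KVM_SET_USER_MEMORY_REGION call to
create the memory slot; that step is left out here because it needs a VM
file descriptor.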
>
> Implement the "raw" EPC allocation in the x86 core-SGX subsystem via
> /dev/sgx_vepc rather than in KVM. Doing so has two major advantages:
>
>   - Does not require changes to KVM's uAPI, e.g. EPC gets handled as
>     just another memory backend for guests.
>
>   - EPC management is wholly contained in the SGX subsystem, e.g. SGX
>     does not have to export any symbols, changes to reclaim flows don't
>     need to be routed through KVM, SGX's dirty laundry doesn't have to
>     get aired out for the world to see,

Good one. :-)

>     and so on and so forth.
>
> The virtual EPC pages allocated to guests are currently not reclaimable.
> Reclaiming EPC page used by enclave requires a special reclaim mechanism
> separate from normal page reclaim, and that mechanism is not supported
> for virtual EPC pages. Due to the complications of handling reclaim
> conflicts between guest and host, reclaiming virtual EPC pages is
> significantly more complex than basic support for SGX virtualization.

What happens if someone in the future wants to change that? Does someone
just need to write patches, or is there a more fundamental stopper issue
involved?

As always, I might be missing something but that doesn't stop me from
being devil's advocate. :-)

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette