On Wed, Jan 02, 2019 at 12:47:52PM -0800, Sean Christopherson wrote:
> On Sat, Dec 22, 2018 at 10:25:02AM +0200, Jarkko Sakkinen wrote:
> > On Sat, Dec 22, 2018 at 10:16:49AM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Dec 20, 2018 at 12:32:04PM +0200, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 19, 2018 at 06:58:48PM -0800, Andy Lutomirski wrote:
> > > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > > > > opening a new instance of /dev/sgx for each enclave?
> > > >
> > > > I think that fits better to the SCM_RIGHTS scenario i.e. you could send
> > > > the enclave to a process that does not necessarily have rights to
> > > > /dev/sgx. Gives more robust environment to configure SGX.
> > >
> > > Sean, is this why you wanted enclave fd and anon inode and not just use
> > > the address space of /dev/sgx? Just taking notes of all observations.
> > > I'm not sure what your rationale was (maybe it was somewhere). This was
> > > something I made up, and this one is wrong deduction. You can easily
> > > get the same benefit with /dev/sgx associated fd representing the
> > > enclave.
> > >
> > > This all means that for v19 I'm going without enclave fd involved with
> > > fd to /dev/sgx representing the enclave. No anon inodes will be
> > > involved.
> >
> > Based on these observations I updated the uapi.
> >
> > As far as I'm concerned there has to be a solution to do EPC mapping
> > with a sequence:
> >
> > 1. Ping /dev/kvm to do something.
> > 2. KVM asks SGX core to do something.
> > 3. SGX core does something.
> >
> > I don't care what the something exactly is, but KVM is the only sane
> > place for KVM uapi. I would be surprised if KVM maintainers didn't agree
> > that they don't want to sprinkle KVM uapi to random places in other
> > subsystems.
>
> It's not a KVM uapi.
>
> KVM isn't a hypervisor in the traditional sense. The "real" hypervisor
> lives in userspace, e.g.
> Qemu, KVM is essentially just a (very fancy)
> driver for hardware accelerators, e.g. VMX. Qemu for example is fully
> capable of running an x86 VM without KVM, it's just substantially slower.
>
> In terms of guest memory, KVM doesn't care or even know what a particular
> region of memory represents or what, if anything, is backing a region in
> the host. There are cases when KVM is made aware of certain aspects of
> guest memory for performance or functional reasons, e.g. emulated MMIO
> and encrypted memory, but in all cases the control logic ultimately
> resides in userspace.
>
> SGX is a weird case because ENCLS can't be emulated in software, i.e.
> exposing SGX to a VM without KVM's help would be difficult. But, it
> wouldn't be impossible, just slow and ugly.
>
> And so, ignoring host oversubscription for the moment, there is no hard
> requirement that SGX EPC can only be exposed to a VM through KVM. In
> other words, allocating and exposing EPC to a VM is orthogonal to KVM
> supporting SGX. Exposing EPC to userspace via /dev/sgx/epc would mean
> that KVM would handle it like any other guest memory region, and all EPC
> related code/logic would reside in the SGX subsystem.

I'm fine doing that if it makes sense. I just don't understand why you
cannot add ioctls to /dev/kvm for allocating the region. Why isn't that
possible? As I said to Andy earlier, adding new device files is easy as
everything related to device creation is nicely encapsulated.

> Oversubscription throws a wrench in the system because ENCLV can only
> be executed post-VMXON and EPC conflicts generate VMX VM-Exits. But
> even then, KVM doesn't need to own the EPC uapi, e.g. it can call into
> the SGX subsystem to handle EPC conflict VM-Exits and the SGX subsystem
> can wrap ENCLV with exception fixup and forcefully reclaim EPC pages if
> ENCLV faults.

If the uapi is *only* for KVM, it should definitely own it.
KVM calling SGX subsystem on a conflict is KVM using in-kernel APIs
provided by the SGX core.

> I can't be 100% certain the oversubscription scheme will be sane without
> actually writing the code, but I'd like to at least keep the option open,
> i.e. not structure /dev/sgx/ in such a way that adding e.g. /dev/sgx/epc
> is impossible or ugly.

/Jarkko