On Tue, Dec 18, 2018 at 07:44:18AM -0800, Sean Christopherson wrote:
> On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> > On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> > >
> > > My brain is still sorting out the details, but I generally like the idea
> > > of allocating an anon inode when creating an enclave, and exposing the
> > > other ioctls() via the returned fd. This is essentially the approach
> > > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > > and vCPUS. There are even similarities to accessing physical memory via
> > > multiple disparate domains, e.g. host kernel, host userspace and guest.
> >
> > In my mind, opening /dev/sgx would give you the requisite inode. I'm
> > not 100% sure that the chardev infrastructure allows this, but I think
> > it does.
>
> My fd/inode knowledge is lacking, to say the least. Whatever works, so
> long as we have a way to uniquely identify enclaves.

Actually, while we're dissecting the interface...

What if we re-organize the ioctls in such a way that we leave open the
possibility of allocating raw EPC for KVM via /dev/sgx? I'm not 100%
positive this approach will work[1], but conceptually it fits well with
KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
generally speaking doesn't know what's physically backing each memory
region.

Tangentially related, I think we should support allocating multiple
enclaves from a single /dev/sgx fd, i.e. a process shouldn't have to
open /dev/sgx every time it wants to create a new enclave.

Something like this:

/dev/sgx
 |
 -> mmap() { return -EINVAL; }
 |
 -> unlocked_ioctl()
     |
     -> SGX_CREATE_ENCLAVE: { return alloc_enclave_fd(); }
     |   |
     |   -> mmap() { ... }
     |   |   |
     |   |   -> get_unmapped_area() {
     |   |          if (enclave->size) {
     |   |                  if (!addr)
     |   |                          addr = enclave->base;
     |   |                  if (addr + len + pgoff > enclave->base + enclave->size)
     |   |                          return -EINVAL;
     |   |          } else {
     |   |                  if (!validate_size(len))
     |   |                          return -EINVAL;
     |   |                  addr = naturally_align(len);
     |   |          }
     |   |      }
     |   |
     |   -> unlocked_ioctl() {
     |          SGX_ENCLAVE_ADD_PAGE: { ... }
     |          SGX_ENCLAVE_INIT: { ... }
     |          SGX_ENCLAVE_REMOVE_PAGES: { ... }
     |          SGX_ENCLAVE_MODIFY_PAGES: { ... }
     |      }
     |
     -> SGX_CREATE_VIRTUAL_EPC: { return alloc_epc_fd(); }
         |
         -> mmap() { ... }
         |   |
         |   -> get_unmapped_area() { <page aligned/sized> }
         |
         -> unlocked_ioctl() {
                SGX_VIRTUAL_EPC_???:
                SGX_VIRTUAL_EPC_???:
            }

[1] Delegating EPC management to /dev/sgx is viable for virtualizing SGX
    without oversubscribing EPC to guests, but oversubscribing EPC in a
    VMM requires handling EPC-related VM-Exits and using instructions
    that will #UD if the CPU is not post-VMXON. I *think* having KVM
    forward VM-Exits to x86/sgx would work, but it's entirely possible
    it'd be a complete cluster.
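
To make the alloc_enclave_fd() piece above a bit more concrete, here's a
completely untested sketch of how the enclave fd could be handed out via
anon_inode_getfd(), i.e. the same mechanism KVM uses to hand out VM fds
for KVM_CREATE_VM. Every name below (struct sgx_encl, sgx_encl_fops, the
made-up SGX_CREATE_ENCLAVE ioctl number, etc.) is a placeholder for
illustration, not a proposal for the actual code:

#include <linux/anon_inodes.h>
#include <linux/fcntl.h>
#include <linux/fs.h>
#include <linux/ioctl.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/slab.h>

/* Hypothetical ioctl number, purely to make the sketch self-contained. */
#define SGX_CREATE_ENCLAVE      _IO('s', 0x00)

struct sgx_encl {
        unsigned long base;
        unsigned long size;
        /* SECS, EPC page tracking, etc. would live here. */
};

/* ioctls on the per-enclave fd returned by SGX_CREATE_ENCLAVE. */
static long sgx_encl_ioctl(struct file *filep, unsigned int cmd,
                           unsigned long arg)
{
        /*
         * SGX_ENCLAVE_ADD_PAGE, SGX_ENCLAVE_INIT, etc. would operate on
         * filep->private_data (the enclave), never on /dev/sgx itself.
         */
        switch (cmd) {
        default:
                return -ENOIOCTLCMD;
        }
}

static int sgx_encl_mmap(struct file *filep, struct vm_area_struct *vma)
{
        /*
         * Range/alignment checks against the enclave's base and size,
         * i.e. the get_unmapped_area() pseudocode above, would go here.
         */
        return 0;
}

static const struct file_operations sgx_encl_fops = {
        .owner          = THIS_MODULE,
        .unlocked_ioctl = sgx_encl_ioctl,
        .mmap           = sgx_encl_mmap,
};

/* SGX_CREATE_ENCLAVE: allocate an enclave and hand back a dedicated fd. */
static long sgx_ioc_create_enclave(void)
{
        struct sgx_encl *encl;
        int fd;

        encl = kzalloc(sizeof(*encl), GFP_KERNEL);
        if (!encl)
                return -ENOMEM;

        /* ECREATE, EPC bookkeeping, etc. would happen around here. */

        fd = anon_inode_getfd("sgx-enclave", &sgx_encl_fops, encl,
                              O_RDWR | O_CLOEXEC);
        if (fd < 0)
                kfree(encl);

        return fd;
}

/* unlocked_ioctl() for /dev/sgx itself. */
static long sgx_dev_ioctl(struct file *filep, unsigned int cmd,
                          unsigned long arg)
{
        switch (cmd) {
        case SGX_CREATE_ENCLAVE:
                return sgx_ioc_create_enclave();
        default:
                return -ENOIOCTLCMD;
        }
}

/* Registering this as the /dev/sgx chardev (misc_register(), etc.) omitted. */
static const struct file_operations sgx_dev_fops = {
        .owner          = THIS_MODULE,
        .unlocked_ioctl = sgx_dev_ioctl,
};

The flow from userspace would then be: open /dev/sgx once, call
SGX_CREATE_ENCLAVE for each enclave, and issue all subsequent mmap() and
ioctl() calls against the per-enclave fd.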