On Sat, Oct 03, 2020 at 07:50:46AM +0300, Jarkko Sakkinen wrote: > Intel Software Guard eXtensions (SGX) is a set of CPU instructions that can > be used by applications to set aside private regions of code and data. The > code outside the SGX hosted software entity is prevented from accessing the > memory inside the enclave by the CPU. We call these entities enclaves. > > Add a driver that provides an ioctl API to construct and run enclaves. > Enclaves are constructed from pages residing in reserved physical memory > areas. The contents of these pages can only be accessed when they are > mapped as part of an enclave, by a hardware thread running inside the > enclave. > > The starting state of an enclave consists of a fixed measured set of > pages that are copied to the EPC during the construction process by > using the opcode ENCLS leaf functions and Software Enclave Control > Structure (SECS) that defines the enclave properties. > > Enclaves are constructed by using ENCLS leaf functions ECREATE, EADD and > EINIT. ECREATE initializes SECS, EADD copies pages from system memory to > the EPC and EINIT checks a given signed measurement and moves the enclave > into a state ready for execution. > > An initialized enclave can only be accessed through special Thread Control > Structure (TCS) pages by using ENCLU (ring-3 only) leaf EENTER. This leaf > function converts a thread into enclave mode and continues the execution in > the offset defined by the TCS provided to EENTER. An enclave is exited > through syscall, exception, interrupts or by explicitly calling another > ENCLU leaf EEXIT. > > The mmap() permissions are capped by the contained enclave page > permissions. The mapped areas must also be populated, i.e. each page > address must contain a page. This logic is implemented in > sgx_encl_may_map(). > > Cc: linux-security-module@xxxxxxxxxxxxxxx > Cc: linux-mm@xxxxxxxxx > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> > Acked-by: Jethro Beekman <jethro@xxxxxxxxxxxx> > Tested-by: Jethro Beekman <jethro@xxxxxxxxxxxx> > Tested-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> > Tested-by: Chunyang Hui <sanqian.hcy@xxxxxxxxxx> > Tested-by: Jordan Hand <jorhand@xxxxxxxxxxxxxxxxxxx> > Tested-by: Nathaniel McCallum <npmccallum@xxxxxxxxxx> > Tested-by: Seth Moore <sethmo@xxxxxxxxxx> > Tested-by: Darren Kenny <darren.kenny@xxxxxxxxxx> > Reviewed-by: Darren Kenny <darren.kenny@xxxxxxxxxx> > Co-developed-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > Co-developed-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx> > Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> > --- > arch/x86/kernel/cpu/sgx/Makefile | 2 + > arch/x86/kernel/cpu/sgx/driver.c | 173 ++++++++++++++++ > arch/x86/kernel/cpu/sgx/driver.h | 29 +++ > arch/x86/kernel/cpu/sgx/encl.c | 331 +++++++++++++++++++++++++++++++ > arch/x86/kernel/cpu/sgx/encl.h | 85 ++++++++ > arch/x86/kernel/cpu/sgx/main.c | 11 + > 6 files changed, 631 insertions(+) > create mode 100644 arch/x86/kernel/cpu/sgx/driver.c > create mode 100644 arch/x86/kernel/cpu/sgx/driver.h > create mode 100644 arch/x86/kernel/cpu/sgx/encl.c > create mode 100644 arch/x86/kernel/cpu/sgx/encl.h > > diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile > index 79510ce01b3b..3fc451120735 100644 > --- a/arch/x86/kernel/cpu/sgx/Makefile > +++ b/arch/x86/kernel/cpu/sgx/Makefile > @@ -1,2 +1,4 @@ > obj-y += \ > + driver.o \ > + encl.o \ > main.o > diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c > new file mode 100644 > index 000000000000..f54da5f19c2b > --- /dev/null > +++ b/arch/x86/kernel/cpu/sgx/driver.c > @@ -0,0 +1,173 @@ > +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) You use gpl-only header files in this file, so how in the world can it be bsd-3 licensed? Please get your legal department to agree with this, after you explain to them how you are mixing gpl2-only code in with this file. > +// Copyright(c) 2016-18 Intel Corporation. Dates are hard to get right :( > + > +#include <linux/acpi.h> > +#include <linux/miscdevice.h> > +#include <linux/mman.h> > +#include <linux/security.h> > +#include <linux/suspend.h> > +#include <asm/traps.h> > +#include "driver.h" > +#include "encl.h" > + > +u64 sgx_encl_size_max_32; > +u64 sgx_encl_size_max_64; > +u32 sgx_misc_reserved_mask; > +u64 sgx_attributes_reserved_mask; > +u64 sgx_xfrm_reserved_mask = ~0x3; > +u32 sgx_xsave_size_tbl[64]; > + > +static int sgx_open(struct inode *inode, struct file *file) > +{ > + struct sgx_encl *encl; > + int ret; > + > + encl = kzalloc(sizeof(*encl), GFP_KERNEL); > + if (!encl) > + return -ENOMEM; > + > + atomic_set(&encl->flags, 0); > + kref_init(&encl->refcount); > + xa_init(&encl->page_array); > + mutex_init(&encl->lock); > + INIT_LIST_HEAD(&encl->mm_list); > + spin_lock_init(&encl->mm_lock); > + > + ret = init_srcu_struct(&encl->srcu); > + if (ret) { > + kfree(encl); > + return ret; > + } > + > + file->private_data = encl; > + > + return 0; > +} > + > +static int sgx_release(struct inode *inode, struct file *file) > +{ > + struct sgx_encl *encl = file->private_data; > + struct sgx_encl_mm *encl_mm; > + > + for ( ; ; ) { > + spin_lock(&encl->mm_lock); > + > + if (list_empty(&encl->mm_list)) { > + encl_mm = NULL; > + } else { > + encl_mm = list_first_entry(&encl->mm_list, > + struct sgx_encl_mm, list); > + list_del_rcu(&encl_mm->list); > + } > + > + spin_unlock(&encl->mm_lock); > + > + /* The list is empty, ready to go. */ > + if (!encl_mm) > + break; > + > + synchronize_srcu(&encl->srcu); > + mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); > + kfree(encl_mm); > + } > + > + mutex_lock(&encl->lock); > + atomic_or(SGX_ENCL_DEAD, &encl->flags); So you set a flag that this is dead, and then instantly delete it? Why does that matter? I see you check for this flag elsewhere, but as you are just about to delete this structure, how can this be an issue? > + mutex_unlock(&encl->lock); > + > + kref_put(&encl->refcount, sgx_encl_release); Don't you need to hold the lock across the put? If not, what is serializing this? But an even larger comment, why is this reference count needed at all? You never grab it except at init time, and you free it at close time. Why not rely on the reference counting that the vfs ensures you? > + return 0; > +} > + > +static int sgx_mmap(struct file *file, struct vm_area_struct *vma) > +{ > + struct sgx_encl *encl = file->private_data; > + int ret; > + > + ret = sgx_encl_may_map(encl, vma->vm_start, vma->vm_end, vma->vm_flags); > + if (ret) > + return ret; > + > + ret = sgx_encl_mm_add(encl, vma->vm_mm); > + if (ret) > + return ret; > + > + vma->vm_ops = &sgx_vm_ops; > + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; > + vma->vm_private_data = encl; > + > + return 0; > +} > + > +static unsigned long sgx_get_unmapped_area(struct file *file, > + unsigned long addr, > + unsigned long len, > + unsigned long pgoff, > + unsigned long flags) > +{ > + if ((flags & MAP_TYPE) == MAP_PRIVATE) > + return -EINVAL; > + > + if (flags & MAP_FIXED) > + return addr; > + > + return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); > +} > + > +static const struct file_operations sgx_encl_fops = { > + .owner = THIS_MODULE, > + .open = sgx_open, > + .release = sgx_release, > + .mmap = sgx_mmap, > + .get_unmapped_area = sgx_get_unmapped_area, > +}; > + > +static struct miscdevice sgx_dev_enclave = { > + .minor = MISC_DYNAMIC_MINOR, > + .name = "enclave", > + .nodename = "sgx/enclave", A subdir for a single device node? Ok, odd, but why not just "sgx_enclave"? How "special" is this device node? thanks, greg k-h