On Thu, May 16, 2019 at 05:26:15PM -0700, Andy Lutomirski wrote:
> On Thu, May 16, 2019 at 5:03 PM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Wed, May 15, 2019 at 11:27:04AM -0700, Andy Lutomirski wrote:
> > > Here's a very vague proposal that's kind of like what I've been
> > > thinking over the past few days.  The SGX inode could track, for each
> > > page, a "safe-to-execute" bit.  When you first open /dev/sgx/enclave,
> > > you get a blank enclave and all pages are safe-to-execute.  When you
> > > do the ioctl to load context (which could be code, data, or anything
> > > else), the kernel will check whether the *source* VMA is executable
> > > and, if not, mark the page of the enclave being loaded as unsafe.
> > > Once the enclave is initialized, the driver will clear the
> > > safe-to-execute bit for any page that is successfully mapped writably.
> > >
> > > The intent is that a page of the enclave is safe-to-execute if that
> > > page was populated from executable memory and not modified since then.
> > > LSMs could then enforce a policy that you can map an enclave page RX
> > > if the page is safe-to-execute, you can map any page you want for
> > > write if there are no executable mappings, and you can only map a page
> > > for write and execute simultaneously if you have EXECMOD permission.
> > > This should allow an enclave to be loaded by userspace from a file
> > > with EXECUTE rights.
> >
> > I'm still confused as to why you want to track execute permissions on the
> > enclave pages and add SGX-specific LSM hooks.  Is there anything that
> > prevents userspace from building the enclave like any other DSO and then
> > copying it into enclave memory?
>
> It's entirely possible that I'm the one missing something.  But here's
> why I think this:
>
> > I feel like I'm missing something.
> >
> >   1. Userspace loads enclave into regular memory, e.g. like a normal DSO.
> >      All mmap(), mprotect(), etc... calls are subject to all existing
> >      LSM policies.
> >
> >   2. Userspace opens /dev/sgx/enclave to instantiate a new enclave.
> >
> >   3. Userspace uses mmap() to allocate virtual memory for its enclave,
> >      again subject to all existing LSM policies (sane userspaces map it RO
> >      since the permissions eventually get tossed anyways).
>
> Is userspace actually required to mmap() the enclave prior to EADDing things?

It was a requirement prior to the API rework in v20, i.e. unless someone
was really quick on the draw after the v20 update, all existing userspace
implementations mmap() the enclave before ECREATE.  Requiring a valid
enclave VMA for EADD shouldn't be too onerous.

> >   4. SGX subsystem refuses to service page faults for enclaves that have
> >      not yet been initialized, e.g. signals SIGBUS or SIGSEGV.
> >
> >   5. Userspace invokes SGX ioctl() to copy enclave from regular VMA to
> >      enclave VMA.
> >
> >   6. SGX ioctl() propagates VMA protection-related flags from source VMA
> >      to enclave VMA, e.g. invokes mprotect_fixup().  Enclave VMA(s) may
> >      be split as part of this process.
>
> Does this also call the LSM?  If so, what is it expected to do?

Nope.  My reasoning behind skipping LSM checks is that the LSMs have
already ok'd the source VMAs, similar to how dup_mmap() doesn't redo LSM
checks.

> What happens if there are different regions with different permissions on
> the same page?  SGX has 256-byte granularity right?

No, EPC pages have 4k granularity.

  The EPC is divided into EPC pages. An EPC page is 4KB in size and always
  aligned on a 4KB boundary.

EEXTEND is the only aspect of SGX that works on 256-byte chunks, and that
goofiness is primarily to keep the latency of EEXTEND low enough so that
the instruction doesn't have to be interruptible, a la EINIT.

> >
> >   7. At all times, mprotect() calls on the enclave VMA are subject to
> >      existing LSM policies, i.e. it's not special cased for enclaves.
>
> I don't think the normal behavior actually works here.  An enclave is
> always MAP_SHARED, so (with SELinux) mprotecting() to X or RX requires
> EXECUTE and mprotecting() to RWX requires extra permissions.

Requiring extra permissions is good though, right?  My thinking is to make
the EADD "VMA copy" the happy/easy path, while using mprotect() to convert
EPC memory to executable would require PROCESS__EXECMEM (assuming we back
enclaves with anon inodes instead of /dev/sgx/enclave).

> But user code can also mmap() the enclave again.  What is supposed to
> happen in that case?

Hmm, it can't effectively re-mmap() the enclave as executable since entering
the enclave requires using the correct virtual address range, i.e. EENTER
would fail.  It could, I think, do munmap()->mmap() to change the permissions.
We could handle that case fairly easily by invoking security_file_mprotect()
in SGX's mmap() hook if any pages have been added to the enclave, i.e. treat
mmap() like mprotect().
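
To make the flow above a bit more concrete, here is a rough userspace model,
not driver code.  Everything named model_* or MODEL_* is made up for
illustration, the LSM is reduced to a caller-supplied callback, and the
regular mmap()/mprotect() LSM checks that always apply are assumed to happen
outside the model.  It only sketches three things: EADD inheriting the source
VMA's protections without a second LSM call, mmap() being treated like
mprotect() once any page has been added, and the 4KB vs. 256-byte arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EPC_PAGE_SIZE 4096u /* EPC pages are 4KB and 4KB-aligned */
#define MODEL_EEXTEND_CHUNK  256u /* EEXTEND measures 256 bytes at a time */

enum { MODEL_PROT_R = 1, MODEL_PROT_W = 2, MODEL_PROT_X = 4 };

/* Hypothetical stand-in for an enclave; not a kernel structure. */
struct model_enclave {
	size_t pages_added; /* number of pages EADDed so far */
};

/* Hypothetical stand-in for the LSM decision on a requested protection. */
typedef bool (*model_lsm_check_fn)(int prot);

/* Example policy for the model: refuse anything executable. */
bool model_deny_exec(int prot)
{
	return !(prot & MODEL_PROT_X);
}

/* Number of EEXTEND operations needed to measure one full EPC page. */
unsigned int model_eextend_ops_per_page(void)
{
	return MODEL_EPC_PAGE_SIZE / MODEL_EEXTEND_CHUNK;
}

/*
 * Step 6: the enclave page inherits the source VMA's protections.  No LSM
 * callback is taken here on purpose, the source VMA was already vetted.
 */
int model_eadd(struct model_enclave *encl, int src_vma_prot)
{
	assert(encl != NULL);
	encl->pages_added++;
	return src_vma_prot;
}

/*
 * mmap() hook: once any page has been added, run the same check that an
 * mprotect() to 'prot' would get.  Before that, no extra check is made.
 */
bool model_mmap(const struct model_enclave *encl, int prot,
		model_lsm_check_fn lsm)
{
	assert(encl != NULL && lsm != NULL);
	if (encl->pages_added == 0)
		return true;
	return lsm(prot);
}
```

With model_deny_exec() as the policy, an RX model_mmap() succeeds on an empty
enclave but is refused after the first model_eadd(), which is the
munmap()->mmap() case I'm worried about.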