Hi, LSM and SELinux people- We're trying to figure out how SGX fits in with LSMs. For background, an SGX library is functionally a bit like a DSO, except that it's nominally resistant to attack from outside and the process of loading it is complicated. To load an enclave, a program can open /dev/sgx/enclave, do some ioctls to load the code and data segments into the enclave, call a special ioctl to "initialize" the enclave, and then call into the enclave (using special CPU instructions). One nastiness is that there is not actually a universally agreed upon, documented file format for enclaves. Windows has an undocumented format, and there are probably a few others out there. No one really wants to teach the kernel to parse enclave files. There are two issues with how this interacts with LSMs: 1) LSMs might want to be able to whitelist, blacklist, or otherwise restrict what enclaves can run at all. The current proposal that everyone seems to dislike the least is to have a .sigstruct file on disk that contains a hash and signature of the enclave in a CPU-defined format. To initialize an enclave, a program will pass an fd to this file, and a new LSM hook can be called to allow or disallow the operation. In a SELinux context, the idea is that policy could require the .sigstruct file to be labeled with a type like sgx_sigstruct_t, and only enclaves that have a matching .sigstruct with such a label could run. 2) Just like any other DSO, there are potential issues with how enclaves deal with writable vs executable memory. This takes two forms. First, a task should probably require EXECMEM, EXECMOD, or similar permission to run an enclave that can modify its own text. Second, it would be nice if a task that did *not* have EXECMEM, EXECMOD, or similar could still run the enclave if it had EXECUTE permission on the file containing the enclave. Currently, this all works because DSOs are run by mmapping the file to create multiple VMAs, some of which are executable, non-writable, and non-CoWed, and some of which are writable but not executable. With SGX, there's only really one inode per enclave (the anon_inode that comes form /dev/sgx/enclave), and it can only be sensibly mapped MAP_SHARED. With the current version of the SGX driver, to run an enclave, I think you'll need either EXECUTE rights to /dev/sgx/enclave or EXECMOD or similar, all of which more or less mean that you can run any modified code you want, and none of which is useful to prevent enclaves from contain RWX segments. So my question is: what, if anything, should change to make this work better? Here's a very vague proposal that's kind of like what I've been thinking over the past few days. The SGX inode could track, for each page, a "safe-to-execute" bit. When you first open /dev/sgx/enclave, you get a blank enclave and all pages are safe-to-execute. When you do the ioctl to load context (which could be code, data, or anything else), the kernel will check whether the *source* VMA is executable and, if not, mark the page of the enclave being loaded as unsafe. Once the enclave is initialized, the driver will clear the safe-to-execute bit for any page that is successfully mapped writably. The intent is that a page of the enclave is safe-to-execute if that page was populated from executable memory and not modified since then. LSMs could then enforce a policy that you can map an enclave page RX if the page is safe-to-execute, you can map any page you want for write if there are no executable mappings, and you can only map a page for write and execute simultaneously if you can EXECMOD permission. This should allow an enclave to be loaded by userspace from a file with EXECUTE rights. So here are my questions: Are the goals I mentioned reasonable? Is the design I just outlined reasonable? Would SELinux support this? Is there a better solution that works well enough? Thanks, all! > On May 14, 2019, at 6:30 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > > >> But thinking this all through, it's a bit more complicated than any of >> this. Looking at the SELinux code for inspiration, there are quite a >> few paths, but they boil down to two cases: EXECUTE is the right to >> map an unmodified file executably, and EXECMOD/EXECMEM (the >> distinction seems mostly irrelevant) is the right to create (via mmap >> or mprotect) a modified anonymous file mapping or a non-file-backed >> mapping that is executable. So, if we do nothing, then mapping an >> enclave with execute permission will require either EXECUTE on the >> enclave inode or EXECMOD/EXECMEM, depending on exactly how this gets >> set up. > > If we do literally nothing, then I'm pretty sure mapping an enclave will > require PROCESS__EXECMEM. The mmap() for the actual enclave is done > using an anon inode, e.g. from /dev/sgx/enclave. Anon inodes are marked > private, which means inode_has_perm() will always return "success". The > only effective check is in file_map_prot_check() when default_noexec is > true, in which case requesting PROT_EXEC on private inodes requires > PROCESS__EXECMEM. > >> So all is well, sort of. The problem is that I expect there will be >> people who want enclaves to work in a process that does not have these >> rights. To make this work, we probably need do some surgery on >> SELinux. ISTM the act of copying (via the EADD ioctl) data from a >> PROT_EXEC mapping to an enclave should not be construed as "modifying" >> the enclave for SELinux purposes. Actually doing this could be >> awkward, since the same inode will have executable parts and >> non-executable parts, and SELinux can't really tell the difference. > > Rather the do surgery on SELinux, why not go with Cedric's original > proposal and propagate the permissions from the source VMA to the EPC > VMA? Which EPC VMA? Users can map the enclave fd again after EADD, resulting in a new VMA. And any realistic enclave will presumably have RO, RW, and RX pages. > The enclave mmap() from userspace could then be done with RO > permissions so as to not run afoul of LSMs. Adding PROT_EXEC after > EADD would require PROCESS__EXECMEM, but that's in line with mprotect() > on regular memory. How does this help anything? The driver currently only works with EXECMEM and, with this change, it still needs EXECMEM. I think that, if we’re going to make changes along these lines, the goal should be that you can have an enclave serialized in a file on disk such that you have EXECUTE on the file, and you should be able to load and run the enclave without needing EXECMEM. (Unless the enclave is self-modifying, of course.)