On Fri, Sep 18, 2020 at 04:53:37PM -0700, Sean Christopherson wrote: > On Fri, Sep 18, 2020 at 08:09:04AM -0700, Andy Lutomirski wrote: > > On Tue, Sep 15, 2020 at 4:28 AM Jarkko Sakkinen > > <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote: > > > > > > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > > > > > > Add vm_ops()->mprotect() for additional constraints for a VMA. > > > > > > Intel Software Guard eXtensions (SGX) will use this callback to add two > > > constraints: > > > > > > 1. Verify that the address range does not have holes: each page address > > > must be filled with an enclave page. > > > 2. Verify that VMA permissions won't surpass the permissions of any enclave > > > page within the address range. Enclave cryptographically sealed > > > permissions for each page address that set the upper limit for possible > > > VMA permissions. Not respecting this can cause #GP's to be emitted. > > Side note, #GP is wrong. EPCM violations are #PFs. Skylake CPUs #GP, but > that's technically an errata. But this isn't the real motivation, e.g. > userspace can already trigger #GP/#PF by reading/writing a bad address, SGX > simply adds another flavor. > > > It's been awhile since I looked at this. Can you remind us: is this > > just preventing userspace from shooting itself in the foot or is this > > something more important? > > Something more important, it's used to prevent userspace from circumventing > a noexec filesystem by loading code into an enclave, and to give the kernel the > option of adding enclave specific LSM policies in the future. > > The source file (if one exists) for the enclave is long gone when the enclave > is actually mmap()'d and mprotect()'d. To enforce noexec, the requested > permissions for a given page are snapshotted when the page is added to the > enclave, i.e. when the enclave is built. Enclave pages that will be executable > must originate from an a MAYEXEC VMA, e.g. the source page can't come from a > noexec file system. noexec check is done in __sgx_encl_add_page(), not in this callback. sgx_vma_mprotect() calls sgx_encl_may_map(), which iterates the addresses, checks that permissions are not surpassed and there are no holes. > The ->mprotect() hook allows SGX to reject mprotect() if userspace is declaring > permissions beyond what are allowed, e.g. trying to map an enclave page with > EXEC permissions when the page was added to the enclave without EXEC. For my ear this is tautological because if user space would use differing permissions, it would exactly soot itself in the foot. What really should be documented is to answer why we consider an enclave mappable, if and only if the conditions are that I described about sgx_encl_may_map() functioning. This is obviously part of the answer [*], i.e. with pharsing LSM hook requires an enclave to have a state, which is compatible with the mmap permissions and is fully populated. The 2nd part of the answer is the answer to the question: why we want to feed LSM hooks enclaves exactly in this state. What would you answer to that? [*] https://lore.kernel.org/linux-sgx/20200918232458.GA6175@xxxxxxxxxxxxxxx/T/#u /Jarkko