On Thu, May 23, 2019 at 08:38:17AM -0700, Andy Lutomirski wrote:
> On Thu, May 23, 2019 at 7:17 AM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Thu, May 23, 2019 at 01:26:28PM +0300, Jarkko Sakkinen wrote:
> > > On Wed, May 22, 2019 at 07:35:17PM -0700, Sean Christopherson wrote:
> > > > But actually, there's no need to disallow mmap() after ECREATE since the
> > > > LSM checks also apply to mmap(), e.g. FILE__EXECUTE would be needed to
> > > > mmap() any enclave pages PROT_EXEC. I guess my past self thought mmap()
> > > > bypassed LSM checks? The real problem is that mmap()'ng an existing
> > > > enclave would require FILE__WRITE and FILE__EXECUTE, which puts us back
> > > > at square one.
> > >
> > > I'm lost with the constraints we want to set.
> >
> > As is today, SELinux policies would require enclave loaders to have
> > FILE__WRITE and FILE__EXECUTE permissions on /dev/sgx/enclave. Presumably
> > other LSMs have similar requirements. Requiring all processes to have
> > FILE__{WRITE,EXECUTE} permissions means the permissions don't add much
> > value, e.g. they can't be used to distinguish between an enclave that is
> > being loaded from an unmodified file and an enclave that is being
> > generated on the fly, e.g. Graphene.
> >
> > Looking back at Andy's mail, he was talking about requiring FILE__EXECUTE
> > to run an enclave, so perhaps it's only FILE__WRITE that we're trying to
> > special case.
> >
>
> I thought about this some more, and I have a new proposal that helps
> address the ELRANGE alignment issue and the permission issue at the
> cost of some extra verbosity. Maybe you all can poke holes in it :)
> The basic idea is to make everything more explicit from a user's
> perspective. Here's how it works:
>
> Opening /dev/sgx/enclave gives an enclave_fd that, by design, doesn't
> give EXECUTE or WRITE. mmap() on the enclave_fd only works if you
> pass PROT_NONE and gives the correct alignment.
> The resulting VMA cannot be mprotected or mremapped. It can't be
> mmapped at all until after ECREATE because the alignment isn't known
> before that.

How would we deny mprotect()? struct file_operations does not have a
callback for that (AFAIK).

> Associated with the enclave are a bunch (up to 7) "enclave segment
> inodes". These are anon_inodes that are created automagically. An
> enclave segment is a group of pages, not necessary contiguous, with an
> upper bound on the memory permissions. Each enclave page belongs to a
> segment. When you do EADD, you tell the driver what segment you're
> adding to. [0] This means that EADD gets an extra argument that is a
> permission mask for the page -- in addition to the initial SECINFO,
> you also pass to EADD something to the effect of "I promise never to
> map this with permissions greater than RX".
>
> Then we just need some way to mmap a region from an enclave segment.
> This could be done by having a way to get an fd for an enclave segment
> or it could be done by having a new ioctl SGX_IOC_MAP_SEGMENT. User
> code would use this operation to replace, MAP_FIXED-style, ranges from
> the big PROT_NONE mapping with the relevant pages from the enclave
> segment. The resulting vma would only have VM_MAYWRITE if the segment
> is W, only have VM_MAYEXEC if the segment is X, and only have
> VM_MAYREAD if the segment is R. Depending on implementation details,
> the VMAs might need to restrict mremap() to avoid mapping pages that
> aren't part of the segment in question.
>
> It's plausible that this whole thing works without the magic segment
> inodes under the hood, but figuring that out would need a careful look
> at how all the core mm bits and LSM bits work together.
>
> To get all the LSM stuff to work, SELinux will need some way to
> automatically assign an appropriate label to the segment inodes. I
> assume that such a mechanism already exists and gets used for things
> like sockets, but I haven't actually confirmed this.
>
> [0] There needs to be some vaguely intelligent semantics if you EADD
> the *same* address more than once. A simple solution would be to
> disallow it if the segments don't match.

What if we instead simply:

- Require a PROT_NONE mmap() of the ELRANGE before ECREATE.
- Disallow mprotect() up until EINIT.
- Given that we have a callback for mprotect(), check that the requested
  permissions match the EADD'd permissions.

/Jarkko
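
To make the last point in the list above concrete, here is a rough
userspace model of such a check. It is only a sketch: struct
encl_page_model and both helper functions are hypothetical names made up
for illustration and are not taken from the SGX driver. It also assumes
that "match" means "request nothing beyond what was declared at EADD
time", which is an assumption on top of the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>	/* PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical model: permission upper bound recorded for a page at EADD. */
struct encl_page_model {
	int max_prot;	/* e.g. PROT_READ | PROT_EXEC for "never more than RX" */
};

/* True if 'requested' asks for no bit outside the page's declared bound. */
static bool prot_allowed(const struct encl_page_model *page, int requested)
{
	return (requested & ~page->max_prot) == 0;
}

/* Model of an mprotect()-time check: every page in the range must agree. */
static bool range_prot_allowed(const struct encl_page_model *pages, size_t n,
			       int requested)
{
	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!prot_allowed(&pages[i], requested))
			return false;
	}
	return true;
}
```

With a page declared RX, a request for PROT_READ | PROT_EXEC (or
PROT_NONE) passes and a request that includes PROT_WRITE is refused. A
range is only accepted when every page in it permits the request.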