RE: [RFC PATCH 0/9] security: x86/sgx: SGX vs. LSM

"Xing, Cedric" <cedric.xing@xxxxxxxxx> · Mon, 3 Jun 2019 18:30:54 +0000

> From: Christopherson, Sean J
> Sent: Monday, June 03, 2019 10:16 AM
> 
> On Sun, Jun 02, 2019 at 12:29:35AM -0700, Xing, Cedric wrote:
> > Hi Sean,
> >
> > Generally I agree with your direction but think ALLOW_* flags are
> > completely internal to LSM because they can be both produced and
> > consumed inside an LSM module. So spilling them into SGX driver and
> > also user mode code makes the solution ugly and in some cases
> > impractical because not every enclave host process has a priori
> > knowledge on whether or not an enclave page would be EMODPE'd at
> runtime.
> 
> In this case, the host process should tag *all* pages it *might* convert
> to executable as ALLOW_EXEC.  LSMs can (and should/will) be written in
> such a way that denying ALLOW_EXEC is fatal to the enclave if and only
> if the enclave actually attempts mprotect(PROT_EXEC).

What if those pages contain self-modifying code but the host doesn't know ahead of time? Would it require ALLOW_WRITE|ALLOW_EXEC at EADD? Then would it prevent those pages to start with PROT_EXEC?

Anyway, my point is that it is unnecessary even if it works.

> 
> Take the SELinux path for example.  The only scenario in which
> PROT_WRITE is cleared from @allowed_prot is if the page *starts* with
> PROT_EXEC.
> If PROT_EXEC is denied on a page that starts RW, e.g. an EAUG'd page,
> then PROT_EXEC will be cleared from @allowed_prot.
> 
> As Stephen pointed out, auditing the denials on @allowed_prot means the
> log will contain false positives of a sort.  But this is more of a noise
> issue than true false positives.  E.g. there are three possible outcomes
> for the enclave.
> 
>   - The enclave does not do EMODPE[PROT_EXEC] in any scenario, ever.
>     Requesting ALLOW_EXEC is either a straightforward a userspace bug or
>     a poorly written generic enclave loader.
> 
>   - The enclave conditionally performs EMODPE[PROT_EXEC].  In this case
>     the denial is a true false positive.
> 
>   - The enclave does EMODPE[PROT_EXEC] and its host userspace then fails
>     on mprotect(PROT_EXEC), i.e. the LSM denial is working as intended.
>     The audit log will be noisy, but viewed as a whole the denials
> aren't
>     false positives.

What I was talking about was EMODPE[PROT_WRITE] on an RX page.

> 
> The potential for noisy audit logs and/or false positives is unfortunate,
> but it's (by far) the lesser of many evils.
> 
> > Theoretically speaking, what you really need is a per page flag (let's
> > name it WRITTEN?) indicating whether a page has ever been written to
> > (or more precisely, granted PROT_WRITE), which will be used to decide
> > whether to grant PROT_EXEC when requested in future. Given the fact
> > that all mprotect() goes through LSM and mmap() is limited to
> > PROT_NONE, it's easy for LSM to capture that flag by itself instead of
> asking user mode code to provide it.
> >
> > That said, here is the summary of what I think is a better approach.
> > * In hook security_file_alloc(), if @file is an enclave, allocate some
> data
> >   structure to store for every page, the WRITTEN flag as described
> above.
> >   WRITTEN is cleared initially for all pages.
> 
> This would effectively require *every* LSM to duplicate the SGX driver's
> functionality, e.g. track per-page metadata, implement locking to
> prevent races between multiple mm structs, etc...

Architecturally we shouldn't dictate how LSM makes decisions. ALLOW_* are no difference than PROCESS__* or FILE__* flags, which are just artifacts to assist particular LSMs in decision making. They are never considered part of the LSM interface, even if other LSMs than SELinux may adopt the same/similar approach.

If code duplication is what you are worrying about, you can put them in a library, or implement/export them in some new file (maybe security/enclave.c?) as utility functions. But spilling them into user mode is what I think is unacceptable.

> 
> >   Open: Given a file of type struct file *, how to tell if it is an
> enclave (i.e. /dev/sgx/enclave)?
> > * In hook security_mmap_file(), if @file is an enclave, make sure
> @prot can
> >   only be PROT_NONE. This is to force all protection changes to go
> through
> >   security_file_mprotect().
> > * In the newly introduced hook security_enclave_load(), set WRITTEN
> for pages
> >   that are requested PROT_WRITE.
> 
> How would an LSM associate a page with a specific enclave?  vma->vm_file
> will point always point at /dev/sgx/enclave.  vma->vm_mm is useless
> because we're allowing multiple processes to map a single enclave, not
> to mention that by mm would require holding a reference to the mm.

Each open("/dev/sgx/enclave") syscall creates a *new* instance of struct file to uniquely identify one enclave instance. What I mean is @vma->vm_file, not @vma->vm_file->f_path or @vma->vm_file->f_inode.

> 
> > * In hook security_file_mprotect(), if @vma->vm_file is an enclave,
> look up
> >   and use WRITTEN flags for all pages within @vma, along with other
> global
> >   flags (e.g. PROCESS__EXECMEM/FILE__EXECMOD in the case of SELinux)
> to decide
> >   on allowing/rejecting @prot.
> 
> vma->vm_file will always be /dev/sgx/enclave at this point, which means
> LSMs don't have the necessary anchor back to the source file, e.g. to
> enforce FILE__EXECUTE.  The noexec file system case is also unaddressed.

vma->vm_file identifies an enclave instance uniquely. FILE__EXECUTE is checked by security_enclave_load() using @source_vma->vm_file. Once a page has been EADD'ed, whether to allow RW->RX depends on .sigstruct file (more precisely, the file backing SIGSTRUCT), whose FILE__* attributes could be cached in vma->vm_file->f_security by security_enclave_init().

The noexec case should be addressed in IOC_ADD_PAGES by testing @source_vma->vm_flags & VM_MAYEXEC.

> 
> > * In hook security_file_free(), if @file is an  enclave, free storage
> >   allocated for WRITTEN flags.