> From: Christopherson, Sean J > Sent: Monday, June 03, 2019 10:16 AM > > On Sun, Jun 02, 2019 at 12:29:35AM -0700, Xing, Cedric wrote: > > Hi Sean, > > > > Generally I agree with your direction but think ALLOW_* flags are > > completely internal to LSM because they can be both produced and > > consumed inside an LSM module. So spilling them into SGX driver and > > also user mode code makes the solution ugly and in some cases > > impractical because not every enclave host process has a priori > > knowledge on whether or not an enclave page would be EMODPE'd at > runtime. > > In this case, the host process should tag *all* pages it *might* convert > to executable as ALLOW_EXEC. LSMs can (and should/will) be written in > such a way that denying ALLOW_EXEC is fatal to the enclave if and only > if the enclave actually attempts mprotect(PROT_EXEC). What if those pages contain self-modifying code but the host doesn't know ahead of time? Would it require ALLOW_WRITE|ALLOW_EXEC at EADD? Then would it prevent those pages to start with PROT_EXEC? Anyway, my point is that it is unnecessary even if it works. > > Take the SELinux path for example. The only scenario in which > PROT_WRITE is cleared from @allowed_prot is if the page *starts* with > PROT_EXEC. > If PROT_EXEC is denied on a page that starts RW, e.g. an EAUG'd page, > then PROT_EXEC will be cleared from @allowed_prot. > > As Stephen pointed out, auditing the denials on @allowed_prot means the > log will contain false positives of a sort. But this is more of a noise > issue than true false positives. E.g. there are three possible outcomes > for the enclave. > > - The enclave does not do EMODPE[PROT_EXEC] in any scenario, ever. > Requesting ALLOW_EXEC is either a straightforward a userspace bug or > a poorly written generic enclave loader. > > - The enclave conditionally performs EMODPE[PROT_EXEC]. In this case > the denial is a true false positive. > > - The enclave does EMODPE[PROT_EXEC] and its host userspace then fails > on mprotect(PROT_EXEC), i.e. the LSM denial is working as intended. > The audit log will be noisy, but viewed as a whole the denials > aren't > false positives. What I was talking about was EMODPE[PROT_WRITE] on an RX page. > > The potential for noisy audit logs and/or false positives is unfortunate, > but it's (by far) the lesser of many evils. > > > Theoretically speaking, what you really need is a per page flag (let's > > name it WRITTEN?) indicating whether a page has ever been written to > > (or more precisely, granted PROT_WRITE), which will be used to decide > > whether to grant PROT_EXEC when requested in future. Given the fact > > that all mprotect() goes through LSM and mmap() is limited to > > PROT_NONE, it's easy for LSM to capture that flag by itself instead of > asking user mode code to provide it. > > > > That said, here is the summary of what I think is a better approach. > > * In hook security_file_alloc(), if @file is an enclave, allocate some > data > > structure to store for every page, the WRITTEN flag as described > above. > > WRITTEN is cleared initially for all pages. > > This would effectively require *every* LSM to duplicate the SGX driver's > functionality, e.g. track per-page metadata, implement locking to > prevent races between multiple mm structs, etc... Architecturally we shouldn't dictate how LSM makes decisions. ALLOW_* are no difference than PROCESS__* or FILE__* flags, which are just artifacts to assist particular LSMs in decision making. They are never considered part of the LSM interface, even if other LSMs than SELinux may adopt the same/similar approach. If code duplication is what you are worrying about, you can put them in a library, or implement/export them in some new file (maybe security/enclave.c?) as utility functions. But spilling them into user mode is what I think is unacceptable. > > > Open: Given a file of type struct file *, how to tell if it is an > enclave (i.e. /dev/sgx/enclave)? > > * In hook security_mmap_file(), if @file is an enclave, make sure > @prot can > > only be PROT_NONE. This is to force all protection changes to go > through > > security_file_mprotect(). > > * In the newly introduced hook security_enclave_load(), set WRITTEN > for pages > > that are requested PROT_WRITE. > > How would an LSM associate a page with a specific enclave? vma->vm_file > will point always point at /dev/sgx/enclave. vma->vm_mm is useless > because we're allowing multiple processes to map a single enclave, not > to mention that by mm would require holding a reference to the mm. Each open("/dev/sgx/enclave") syscall creates a *new* instance of struct file to uniquely identify one enclave instance. What I mean is @vma->vm_file, not @vma->vm_file->f_path or @vma->vm_file->f_inode. > > > * In hook security_file_mprotect(), if @vma->vm_file is an enclave, > look up > > and use WRITTEN flags for all pages within @vma, along with other > global > > flags (e.g. PROCESS__EXECMEM/FILE__EXECMOD in the case of SELinux) > to decide > > on allowing/rejecting @prot. > > vma->vm_file will always be /dev/sgx/enclave at this point, which means > LSMs don't have the necessary anchor back to the source file, e.g. to > enforce FILE__EXECUTE. The noexec file system case is also unaddressed. vma->vm_file identifies an enclave instance uniquely. FILE__EXECUTE is checked by security_enclave_load() using @source_vma->vm_file. Once a page has been EADD'ed, whether to allow RW->RX depends on .sigstruct file (more precisely, the file backing SIGSTRUCT), whose FILE__* attributes could be cached in vma->vm_file->f_security by security_enclave_init(). The noexec case should be addressed in IOC_ADD_PAGES by testing @source_vma->vm_flags & VM_MAYEXEC. > > > * In hook security_file_free(), if @file is an enclave, free storage > > allocated for WRITTEN flags.