Re: SGX vs LSM (Re: [PATCH v20 00/28] Intel SGX1 support)

Stephen Smalley <sds@xxxxxxxxxxxxx> · Fri, 17 May 2019 15:20:46 -0400

On 5/17/19 2:05 PM, Stephen Smalley wrote:
On 5/17/19 1:12 PM, Andy Lutomirski wrote:

On May 17, 2019, at 9:37 AM, Stephen Smalley <sds@xxxxxxxxxxxxx> wrote:

On 5/17/19 12:20 PM, Stephen Smalley wrote:
On 5/17/19 11:09 AM, Sean Christopherson wrote:
On Fri, May 17, 2019 at 09:53:06AM -0400, Stephen Smalley wrote:
On 5/16/19 6:23 PM, Xing, Cedric wrote:
I thought EXECMOD applied to files (and memory mappings backed by 
them) but
I was probably wrong. It sounds like EXECMOD applies to the whole 
process so
would allow all pages within a process's address space to be 
modified then
executed, regardless the backing files. Am I correct this time?

No, you were correct the first time I think; EXECMOD is used to 
control
whether a process can make executable a private file mapping that has
previously been modified (e.g. text relocation); it is a special 
case to
support text relocations without having to allow full EXECMEM 
(i.e. execute
arbitrary memory).

SELinux checks relevant to W^X include:

- EXECMEM: mmap/mprotect PROT_EXEC an anonymous mapping 
(regardless of
PROT_WRITE, since we know the content has to have been written at 
some
point) or a private file mapping that is also PROT_WRITE.
- EXECMOD: mprotect PROT_EXEC a private file mapping that has been
previously modified, typically for text relocations,
- FILE__WRITE: mmap/mprotect PROT_WRITE a shared file mapping,
- FILE__EXECUTE: mmap/mprotect PROT_EXEC a file mapping.

(ignoring EXECSTACK and EXECHEAP here since they aren't really 
relevant to
this discussion)

So if you want to ensure W^X, then you wouldn't allow EXECMEM for the
process, EXECMOD by the process to any file, and the combination 
of both
FILE__WRITE and FILE__EXECUTE by the process to any file.

If the /dev/sgx/enclave mappings are MAP_SHARED and you aren't 
using an
anonymous inode, then I would expect that only the FILE__WRITE and
FILE__EXECUTE checks are relevant.

Yep, I was just typing this up in a different thread:

I think we may want to change the SGX API to alloc an anon inode 
for each
enclave instead of hanging every enclave off of the 
/dev/sgx/enclave inode.
Because /dev/sgx/enclave is NOT private, SELinux's 
file_map_prot_check()
will only require FILE__WRITE and FILE__EXECUTE to mprotect() 
enclave VMAs
to RWX.  Backing each enclave with an anon inode will make SELinux 
treat
EPC memory like anonymous mappings, which is what we want (I 
think), e.g.
making *any* EPC page executable will require PROCESS__EXECMEM (SGX is
64-bit only at this point, so SELinux will always have 
default_noexec).
I don't think we want to require EXECMEM (or equivalently both 
FILE__WRITE and FILE__EXECUTE to /dev/sgx/enclave) for making any 
EPC page executable, only if the page is also writable or previously 
modified.  The intent is to prevent arbitrary code execution without 
EXECMEM (or FILE__WRITE|FILE__EXECUTE), while still allowing 
enclaves to be created without EXECMEM as long as the EPC page 
mapping is only ever mapped RX and its initial contents came from an 
unmodified file mapping that was PROT_EXEC (and hence already 
checked via FILE__EXECUTE).

Also, just to be clear, there is nothing inherently better about 
checking EXECMEM instead of checking both FILE__WRITE and 
FILE__EXECUTE to the /dev/sgx/enclave inode, so I wouldn't switch to 
using anon inodes for that reason.  Using anon inodes also 
unfortunately disables SELinux inode-based checking since we no 
longer have any useful inode information, so you'd lose out on 
SELinux ioctl whitelisting on those enclave inodes if that matters.

How can that work?  Unless the API changes fairly radically, users 
fundamentally need to both write and execute the enclave.  Some of it 
will be written only from already executable pages, and some privilege 
should be needed to execute any enclave page that was not loaded like 
this.

I'm not sure what the API is. Let's say they do something like this:

fd = open("/dev/sgx/enclave", O_RDONLY);
addr = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
stuff addr into ioctl args
ioctl(fd, ENCLAVE_CREATE, &ioctlargs);
ioctl(fd, ENCLAVE_ADD_PAGE, &ioctlargs);
ioctl(fd, ENCLAVE_INIT, &ioctlargs);

The important points are that they do not open /dev/sgx/enclave with 
write access (otherwise they will trigger FILE__WRITE at open time, and 
later encounter FILE__EXECUTE as well during mmap, thereby requiring 
both to be allowed to /dev/sgx/enclave), and that they do not request 
PROT_WRITE to the resulting mapping (otherwise they will trigger 
FILE__WRITE at mmap time).  Then only FILE__READ and FILE__EXECUTE are 
required to /dev/sgx/enclave in policy.

If they switch to an anon inode, then any mmap PROT_EXEC of the opened 
file will trigger an EXECMEM check, at least as currently implemented, 
as we have no useful backing inode information.

FWIW, looking at the selftest for SGX in the patch series, they open 
/dev/sgx/enclave O_RDWR (probably not necessary?) and mmap the open file 
RWX.  If that is necessary then I'd rather it show up as FILE__WRITE and 
FILE__EXECUTE to /dev/sgx/enclave instead of EXECMEM, so that we can 
allow the process the ability to perform that mmap without allowing it 
to make other mappings WX.  So staying with the single /dev/sgx/enclave 
inode is better in that regard.