Re: [PATCH v38 10/24] mm: Add vm_ops->mprotect()

Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> · Wed, 23 Sep 2020 17:33:05 +0300

On Tue, Sep 22, 2020 at 08:11:14AM -0700, Dave Hansen wrote:
> Now I'm confused.  I actually don't think I have a strong understanding
> of how an enclave actually gets loaded, how mmap() and mprotect() are
> normally used and what this hook is intended to thwart.

You saw my other comments. I scraped this together based on your
feedback and my responses:

"
mm: Add 'mprotect' callback to vm_ops

Intel Software Guard eXtensions (SGX) allows creation of blobs called
enclaves, for which page permissions are defined when the enclave is first
loaded. Once an enclave is loaded and initialized, it can be mapped to the
process address space.

There is no standard file format for enclaves. They are dynamically built
and the ways enclaves are deployed differ greatly. For an app you might
want to have a simple static binary, but on the other hand for a container
you might want to dynamically create the whole thing at run-time. Also, the
existing ecosystem for SGX is already large, which would make standardizing
a format very hard.

Finally, even if there were a standard format, one would still want a
dynamic way to add pages to the enclave. One big reason for this is that
enclaves have pages, defined at load time, that represent entry points to
the enclave. Each entry point can service one hardware thread at a time and
you might want to parametrize this at run-time depending on your
environment.

The consequence is that enclaves are best created with an ioctl API and the
access control can be based only on the origin of the source file for the
enclave data, i.e. on VMA file pointer and page permissions. For example,
this could be done with LSM hooks that are triggered in the appropriate
ioctls and they could make the access control decision based on this
information.

Unfortunately, there is ENCLU[EMODPE] that a running enclave can use to
upgrade its permissions. If we do not limit mmap() and mprotect(), an
enclave could upgrade its permissions by using EMODPE followed by an
appropriate mprotect() call. This would be completely hidden from the
kernel.

Add 'mprotect' hook to vm_ops, so that a callback can be implemented for SGX
that will ensure that {mmap, mprotect}() permissions do not surpass any of
the original page permissions. This makes it possible to maintain and
refine sane access control for enclaves.
"
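
To make the check in the last paragraph concrete, here is a rough
user-space sketch of what such a callback would have to decide. It is not
the driver code: struct encl_sketch, encl_may_map(), the PERM_* bits and
the fixed 4 KiB page size are all made up for illustration. The only idea
taken from the text above is that the requested protection must not
contain any bit missing from the permissions a page got at load time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, chosen to mirror PROT_READ/WRITE/EXEC. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Assumption for this sketch: fixed 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for an enclave: one permission byte per page. */
struct encl_sketch {
	uintptr_t base;          /* start address of the enclave range */
	size_t nr_pages;         /* number of pages in the enclave */
	const uint8_t *max_perm; /* permissions fixed at load time */
};

/*
 * Return 0 if [start, end) may be mapped with 'prot', -1 otherwise.
 * The request is refused as soon as one covered page lacks a
 * requested permission bit, or if the range leaves the enclave.
 */
static int encl_may_map(const struct encl_sketch *encl, uintptr_t start,
			uintptr_t end, unsigned int prot)
{
	size_t first, last, i;

	if (start < encl->base || end < start)
		return -1;

	first = (start - encl->base) >> SKETCH_PAGE_SHIFT;
	/* 'last' is exclusive; round a partial trailing page up. */
	last = (end - encl->base + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

	if (last > encl->nr_pages)
		return -1;

	for (i = first; i < last; i++) {
		/* Any requested bit outside the load-time set is an upgrade. */
		if (prot & ~(unsigned int)encl->max_perm[i])
			return -1;
	}

	return 0;
}
```

With a page loaded as R+X, asking for R+X passes and asking for R+W is
refused, no matter what the enclave has done with EMODPE in the meantime.
A real callback would hang off vm_ops and return an errno value instead
of -1.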

/Jarkko