Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andy,

On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
On 12/1/21 11:23, Reinette Chatre wrote:
Enclave creators declare their paging permission intent at the time
the pages are added to the enclave. These paging permissions are
vetted when pages are added to the enclave and stashed off
(in sgx_encl_page->vm_max_prot_bits) for later comparison with
enclave PTEs.


I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the EPCM permission bits however it likes with no oversight from the kernel.  So we end up with a whole bunch of permission masks:

Before jumping to the permission masks I would like to step back and just confirm the context. We need to consider the following three permissions:

EPCM permissions: the enclave page permissions maintained in the SGX hardware. The OS is constrained here in that it cannot query the current EPCM permissions. Even so, the OS needs to ensure PTEs are installed appropriately (we do not want a RW PTE for a read-only enclave page) and thus the OS keeps its own record of EPCM permissions to support this. As you note later even in current kernel the enclave can change these permissions without OS knowing. EPCM permissions can only be relaxed without the OS knowledge though so the OS record of EPCM permissions can only ever be stricter than the actual EPCM permissions.

VMA permissions: Current behavior (not changed in this series) is that the OS enforces that a new VMA should have the same or weaker permissions than the EPCM permissions.

Page table entries: These should match the EPCM permissions without exceeding VMA permissions.

The PTE: controlled by complex kernel policy

The VMA: with your series, this is entirely controlled by userspace.  I think I'm fine with that.

vm_max_prot_bits: populated from secinfo at setup time, unless I missed something that changes it later.  Maybe I'm confused or missed something in one of the patches,

Yes, vm_max_prot_bits is currently and continues to be populated from secinfo for pages added before the enclave is initialized and in a later patch it would be hardcoded to RW for pages that are added after the enclave is initialized. In the current implementation vm_max_prot_bits is the OS's record of the EPCM permissions used to guide VMA and PTE permissions.

On a higher level, the implementation decision is that vm_max_prot_bits is the static "vetted" permissions of a page - the maximum permissions a page is allowed to have during its entire lifetime. This matches the current implementation. In the current implementation permissions are only able to change via VMA and PTE ... for example a read-only VMA can access an enclave page with vm_max_prot_bits of RW. With the SGX2 support permission changes are allowed to EPCM permissions - but in this implementation they are not allowed to exceed the originally vetted vm_max_prot_bits.

In this SGX2 implementation an enclave page could thus be added to an enclave with secinfo and vm_max_prot_bits of RW that would only allow that page to have R or RW permissions (VMA, PTE, and OS view of EPCM permissions) in its lifetime, never RX or RWX. Yes, it may be possible for the enclave to change the EPCM permissions from within the enclave using ENCLU[EMODPE] but to access the page the enclave would need the OS to install the appropriate PTE and the OS would not do so if vm_max_prot_bits does not allow it. Neither would the OS allow an executable VMA.


vm_run_prot_bits: populated from some combination of ioctls.  I'm entirely lost as to what this is for.

With SGX2 it is possible to change the EPCM permissions of an enclave page after the enclave is initialized. vm_max_prot_bits would provide guidance to what permissions a page is allowed to have while vm_run_prot_bits contains the current view of EPCM permissions used by the OS to guide whether requested VMA permissions are allowed and guide what PTE permissions should be.

Consider this example how vm_max_prot_bits and vm_run_prot_bits are used:

(1) Add enclave page with secinfo of RW to uninitialized enclave
    vm_max_prot_bits = RW
    vm_run_prot_bits = RW

(2) User space runs SGX_IOC_PAGE_MODP to change the permissions to read-
    only. This is allowed because vm_max_prot_bits = RW. Now:
    vm_max_prot_bits = RW
    vm_run_prot_bits = R

    Now VMAs are created and PTEs installed based on value of
    vm_run_prot_bits - write access will not be allowed.

(3) User space runs SGX_IOC_PAGE_MODP to change the permissions to RX.
    This will be denied because vm_max_prot_bits = RW.

(3) User space runs SGX_IOC_PAGE_MODP to change the permissions to RW.
    This will be allowed because vm_max_prot_bits = RW.

    Now VMAs are created and PTEs installed based on value of
    vm_run_prot_bits - write access will again be allowed.


EPCM bits: controlled by the guest.  basically useless for any host purpose on SGX2 hardware (with or without kernel support -- the enclave can do ENCLU[EMODPE] whether we like it or not, even on old kernels)

Indeed - permissions can only be relaxed without the OS knowledge so the OS's view would always be the same or stricter than the enclave.

So I guess I don't understand the purpose of this patch    or of the rules in the later patches, and I feel like this is getting more complicated than makes sense.


Could we perhaps make vm_max_prot_bits dynamic or at least controllable in some useful way?  My initial proposal (years ago) was for vm_max_prot_bits to be *separately* configured at initial load time instead of being inferred from secinfo with the intent being that the user untrusted runtime would set it appropriately.  I have no problem with allowing runtime changes as long as the security policy makes sense and it's kept consistent with PTEs.

This SGX2 enabling indeed builds on the current implementation where vm_max_prot_bits is inferred from secinfo. The intent is for vm_max_prot_bits to reflect the maximum allowed vetted permissions.

At this time vm_max_prot_bits is indeed static and thus creates the need for (dynamic) vm_run_prot_bits that reflects the current EPCM permissions and guides VMA and PTE permissions while vm_max_prot_bits guides new permission requests. From what I understand this implementation follows the current security policy - permissions are never allowed to exceed the originally vetted permissions. PTEs are kept consistent in that they match the (vetted, OS view of) EPCM permissions.

Also, I think we need a changelog message or, even better, actual docs in kernel, explaining the actual final set of rules and invariants for all these masks.

I will add a section to Documentation/x86/sgx.rst.

Reinette




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux