Hi Jarkko, On 3/3/2022 3:12 PM, Jarkko Sakkinen wrote: > On Wed, Mar 02, 2022 at 02:57:45PM -0800, Reinette Chatre wrote: >>> What do you mean by "user space policy" anyway exactly? I'm sorry but I >>> just don't fully understand this. >> >> My apologies - I just assumed that you would need no reminder about this contentious >> part of SGX history. Essentially it means that, yes, the kernel could theoretically >> permit any kind of access to any file/page, but some accesses are known to generally >> be a bad idea - like making memory executable as well as writable - and thus there >> are additional checks based on what user space permits before the kernel allows >> such accesses. > > The device files are limited by a GID (in systemd upstream), which is a > "user policy". > > What you want to add and why augmentation cannot be made complete before > the unknown factor is added to the access control? After studying this part of SGX history I learned that unfortunately none of the existing user policy controls have been found to be a perfect fit for enclaves. Current user policy type permissions are associated with files and processes and enclaves have properties of both. One process can execute multiple enclaves and only one/some of those enclaves may require to execute dirty pages. Associating a permission to execute dirty pages with the process, and thus giving that ability to all of its enclaves, is not ideal. Similarly, the file /dev/sgx_enclave can represent multiple enclaves used by multiple processes and a file permission is similarly too broad. What I was planning to propose and discuss after the SGX2 core enabling was an ability for user space to uniquely identify enclaves that require the ability to execute dirty pages. This identification can be specified by using enclave properties like MRENCLAVE and MRSIGNER. Executing dirty pages would only be allowed for these specific enclaves identified to require this ability. A solution like this is possible using the kernel's keys subsystem by introducing a new "enclave_execdirty" key that contains these properties. I have this working as a PoC. Perhaps the SGX_IOC_ENCLAVE_AUGMENT_PAGES what you propose can also be seen as a solution to support user space policy ... instead that it is more fine grained in that it is used to identify specific memory ranges within specific enclaves that are allowed to execute dirty pages. What do you think? >>>>> I think the best way to move forward would be to do EAUG's explicitly with >>>>> an ioctl that could also include secinfo for permissions. Then you can >>>>> easily do the rest with EACCEPTCOPY inside the enclave. >>>> >>>> SGX_IOC_ENCLAVE_ADD_PAGES already exists and could possibly be used for >>>> this purpose. It already includes SECINFO which may also be useful if >>>> needing to later support EAUG of PT_SS* pages. >>> >>> You could also simply add SGX_IOC_ENCLAVE_AUGMENT_PAGES and call it a day. >> >> I could, yes. > > And this enables EACCEPTCOPY pattern nicely. > > E.g. you can implement mmap() with EAUG and then EACCEPTCOPY feeded with > permissions and a zero page: > > 1. enclave calls back to host to do mmap() > 2. host does eaug on given range and enter back to enclave. > 3. enclave does eacceptcopy with given permissions and a zero page. > >>> I don't like this type of re-use of the existing API. >> >> I could proceed with SGX_IOC_ENCLAVE_AUGMENT_PAGES if there is consensus after >> considering the user policy question (above) and performance trade-off (more below). > > Ok. > > If adding this would be a bottleneck it would be already persistent int > "add pages", so whatever limitation there might be, it already exist. Currently this checking is built in as part of "add pages", for example, user space is prevented from circumventing existing protections on the source pages with the "vma->vm_flags & VM_MAYEXEC" check in __sgx_encl_add_page(). Further, there is trust here in that the pages added before enclave initialization are accompanied by their secinfo with the permissions of the pages and those values are included in the measurement (MRENCLAVE) of the final enclave. The maximum permissions any enclave page specified during "add pages" may have is "locked down" during this time. Permissions of EAUG pages are not included in the MRENCLAVE of the enclave and there is no backing memory that can be referenced to learn what is already allowed. It is possible that some of the code dynamically loaded into the enclave could indeed be buggy or malicious so effort should be made to only allow executing of dirty pages to those enclaves specified to require the ability. > Thus, logically, that could be safely added without worrying about user > policies all that much... > >> >>> >>>> The big question is whether communicating user policy after enclave initialization >>>> via the SECINFO within SGX_IOC_ENCLAVE_ADD_PAGES is acceptable to all? I would >>>> appreciate a confirmation on this direction considering the significant history >>>> behind this topic. >>> >>> I have no idea because I don't know what is user space policy. >> >> This discussion is about some enclave usages needing RWX permissions >> on dynamically added enclave pages. RWX permissions on dynamically added pages is > > I'm not sure if that is actually necessary, if you use EAUG-EACCEPTCOPY > type of pattern. Please correct if I'm wrong. This only takes EPCM permissions into account. The issue comes in when the kernel needs to determine whether it should allow the PTEs pointing to these pages to be executable. To elaborate your example, to use dynamically added RWX pages EAUG->EACCEPTCOPY->SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is required and SGX_IOC_ENCLAVE_RELAX_PERMISSIONS will only allow PTEs that are allowed. In the driver sgx_encl_page->vm_max_prot_bits dictates what permissions are allowed and SGX_IOC_ENCLAVE_RELAX_PERMISSIONS will return EPERM if an attempt is made to relax permissions beyond that. When considering the user space policy integration, sgx_encl_page->vm_max_prot_bits will be initialized to reflect allowed permissions, RWX if the enclave is so allowed, in this way EAUG pages can be made executable using SGX_IOC_ENCLAVE_RELAX_PERMISSIONS. >> not something that should blindly be allowed for all SGX enclaves but instead the user >> needs to explicitly allow specific enclaves to have such ability. This is equivalent >> to (but not the same as) what exists in Linux today with LSM. As seen in >> mm/mprotect.c:do_mprotect_pkey()->security_file_mprotect() Linux is able to make >> files and memory be both writable and executable, but it would only do so for those >> files and memory that the LSM (which is how user policy is communicated, like SELinux) >> indicates it is allowed, not blindly do so for all files and all memory. > > We could also potentially make LSM hooks to ioctls, if that is ever needed. Could you please elaborate? > > And as I said earlier, EAUG ioctl does not make things any worse they might > be. I hope my earlier comments noting the differences with adding pages shine some light here. > >>>>> Putting EAUG to the #PF handler and implicitly call it just too flakky and >>>>> hard to make deterministic for e.g. JIT compiler in our use case (not to >>>>> mention that JIT is not possible at all because inability to do RX pages). >> >> I understand how SGX_IOC_ENCLAVE_AUGMENT_PAGES can be more deterministic but from >> what I understand it would have a performance impact since it would require all memory >> that may be needed by the enclave be pre-allocated from outside the enclave and not >> just dynamically allocated from within the enclave at the time it is needed. >> >> Would such a performance impact be acceptable? > > IMHO yes because bad behaving enclave can cause the same issue anyway, > and more indeterministic manner. With EAUG pages supported in the page fault handler it is possible to support both usages. Especially now that Dave provided guidance on how to support MAP_POPULATE. As I understand, when MAP_POPULATE is supported a usage needing deterministic behavior can pre-fault all the EAUG pages while those usages mapping a lot of memory that mostly will go unused are also supported. Reinette