On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
> SGX EPC memory allocations are separate from normal RAM allocations and
> are managed solely by the SGX subsystem. The existing cgroup memory
> controller cannot be used to limit or account for SGX EPC memory, which is
> a desirable feature in some environments, e.g., support for pod-level
> control in a Kubernetes cluster on a VM or bare-metal host [1,2].
>
> This patchset implements support for sgx_epc memory within the misc
> cgroup controller. The user can use the misc cgroup controller to set and
> enforce a max limit on total EPC usage per cgroup. The implementation
> reports current usage and events of reaching the limit per cgroup as well
> as the total system capacity.
>
> This work was originally authored by Sean Christopherson a few years ago,
> and previously modified by Kristen C. Accardi to work with more recent
> kernels, and to utilize the misc cgroup controller rather than a custom
> controller. I have now updated the patches based on review comments on the
> V2 series [3], simplified a few aspects of the implementation/design, and
> fixed some stability issues found in testing, while keeping the same
> user-space-facing interfaces.
>
> The patchset adds support for multiple LRUs to track both reclaimable EPC
> pages (i.e. pages the reclaimer knows about), as well as unreclaimable EPC
> pages (i.e. pages which the reclaimer isn't aware of, such as VA pages).
> These pages are assigned to an LRU, as well as an enclave, so that an
> enclave's full EPC usage can be tracked, and limited to a max value. During
> OOM events, an enclave can have its memory zapped, and all the EPC pages
> not tracked by the reclaimer can be freed.
>
> I appreciate your comments and feedback.
>
> Summary of changes from v2: (more details in commit logs)
>
> * Added EPC states to replace flags in sgx_epc_page struct. (Jarkko)
> * Unrolled wrappers for cond_resched, list (Dave)
> * Separate patches for adding reclaimable and unreclaimable lists. (Dave)
> * Other improvements on patch flow, commit messages, styles. (Dave, Jarkko)
> * Simplified the cgroup tree walking with plain
>   css_for_each_descendant_pre.
> * Fixed race conditions and crashes.
> * OOM killer to wait for the victim enclave pages being reclaimed.
> * Unblock the user by handling misc_max_write callback asynchronously.
> * Rebased onto 6.4 and no longer base this series on the MCA patchset.
> * Fix an overflow in misc_try_charge.
> * Fix a NULL pointer in SGX PF handler.
> * Updated and included the SGX selftest patches previously reviewed. Those
>   patches fix issues triggered in high EPC pressure required for cgroup
>   testing.
> * Added test scripts to help set up and test SGX EPC cgroups.
>
> [1]https://lore.kernel.org/all/DM6PR21MB11772A6ED915825854B419D6C4989@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> [2]https://lore.kernel.org/all/ZD7Iutppjj+muH4p@himmelriiki/
> [3]https://lore.kernel.org/all/20221202183655.3767674-1-kristen@xxxxxxxxxxxxxxx/
> [4]Documentation/arch/x86/sgx.rst, Section "Virtual EPC"
>
> Haitao Huang (6):
>   x86/sgx: Store struct sgx_encl when allocating new VA pages
>   x86/sgx: Introduce EPC page states
>   x86/sgx: fix a NULL pointer
>   cgroup/misc: Fix an overflow
>   selftests/sgx: Retry the ioctl()'s returned with EAGAIN
>   selftests/sgx: Add scripts for epc cgroup testing
>
> Jarkko Sakkinen (3):
>   selftests/sgx: Move ENCL_HEAP_SIZE_DEFAULT to main.c
>   selftests/sgx: Use encl->encl_size in sigstruct.c
>   selftests/sgx: Include the dynamic heap size to the ELRANGE
>     calculation
>
> Kristen Carlson Accardi (9):
>   x86/sgx: Add 'struct sgx_epc_lru_lists' to encapsulate lru list(s)
>   x86/sgx: Use sgx_epc_lru_lists for existing active page list
>   x86/sgx: Store reclaimable epc pages in sgx_epc_lru_lists
>   x86/sgx: store unreclaimable EPC pages in sgx_epc_lru_lists
>   x86/sgx: Use a list to track
>     to-be-reclaimed pages
>   cgroup/misc: Add per resource callbacks for CSS events
>   cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver
>   x86/sgx: Limit process EPC usage with misc cgroup controller
>   Docs/x86/sgx: Add description for cgroup support
>
> Sean Christopherson (9):
>   x86/sgx: Add EPC page flags to identify owner type
>   x86/sgx: Introduce RECLAIM_IN_PROGRESS state
>   x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default
>   x86/sgx: Return the number of EPC pages that were successfully
>     reclaimed
>   x86/sgx: Add option to ignore age of page during EPC reclaim
>   x86/sgx: Prepare for multiple LRUs
>   x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup
>   x86/sgx: Add helper to grab pages from an arbitrary EPC LRU
>   x86/sgx: Add EPC OOM path to forcefully reclaim EPC
>
> Vijay Dhanraj (1):
>   selftests/sgx: Add SGX selftest augment_via_eaccept_long
>
>  Documentation/arch/x86/sgx.rst              |  77 ++++
>  arch/x86/Kconfig                            |  13 +
>  arch/x86/kernel/cpu/sgx/Makefile            |   1 +
>  arch/x86/kernel/cpu/sgx/driver.c            |  27 +-
>  arch/x86/kernel/cpu/sgx/encl.c              |  95 +++-
>  arch/x86/kernel/cpu/sgx/encl.h              |   4 +-
>  arch/x86/kernel/cpu/sgx/epc_cgroup.c        | 406 ++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/epc_cgroup.h        |  60 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c             |  25 +-
>  arch/x86/kernel/cpu/sgx/main.c              | 406 ++++++++++++++----
>  arch/x86/kernel/cpu/sgx/sgx.h               | 113 ++++-
>  include/linux/misc_cgroup.h                 |  34 ++
>  kernel/cgroup/misc.c                        |  63 ++-
>  tools/testing/selftests/sgx/load.c          |   8 +-
>  tools/testing/selftests/sgx/main.c          | 177 +++++++-
>  tools/testing/selftests/sgx/main.h          |   6 +-
>  .../selftests/sgx/run_tests_in_misc_cg.sh   |  68 +++
>  tools/testing/selftests/sgx/setup_epc_cg.sh |  29 ++
>  tools/testing/selftests/sgx/sigstruct.c     |   8 +-
>  .../selftests/sgx/watch_misc_for_tests.sh   |  13 +
>  20 files changed, 1446 insertions(+), 187 deletions(-)
>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c
>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h
>  create mode 100755 tools/testing/selftests/sgx/run_tests_in_misc_cg.sh
>  create mode 100755 tools/testing/selftests/sgx/setup_epc_cg.sh
>  create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh
>
> --
> 2.25.1

Thanks for taking the effort, must have been tedious!

BR, Jarkko