On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote: > SGX runtimes such as Gramine may implement EDMM-based lazy allocation of > enclave pages and may support MADV_DONTNEED semantics [1]. The former > implies #PF-based page allocation, and the latter implies the usage of > SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl. > > A trivial program like below (run under Gramine and with EDMM enabled) > stresses these two flows in the SGX driver and hangs: > > /* repeatedly touch different enclave pages at random and mix with > * `madvise(MADV_DONTNEED)` to stress EAUG/EREMOVE flows */ > static void* thread_func(void* arg) { > size_t num_pages = 0xA000 / page_size; > for (int i = 0; i < 5000; i++) { > size_t page = get_random_ulong() % num_pages; > char data = READ_ONCE(((char*)arg)[page * page_size]); > > page = get_random_ulong() % num_pages; > madvise(arg + page * page_size, page_size, MADV_DONTNEED); > } > } > > addr = mmap(NULL, 0xA000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS, -1, 0); > pthread_t threads[16]; > for (int i = 0; i < 16; i++) > pthread_create(&threads[i], NULL, thread_func, addr); I'm not convinced that kernel is the problem here but it could be also how Gramine is implemented. So maybe you could make a better case of that. The example looks a bit artificial to me. > > This program uncovers two data races in the SGX driver. The remaining > patches describe and fix these races. > > I performed several stress tests to verify that there are no other data > races (at least with the test program above): > > - On Icelake server with 128GB of PRMRR (EPC), without madvise(). This > stresses the first data race. A Gramine SGX test suite running in the > background for additional stressing. Result: 1,000 runs without hangs > (result without the first bug fix: hangs every time). > - On Icelake server with 128GB of PRMRR (EPC), with madvise(). This > stresses the second data race. A Gramine SGX test suite running in the > background for additional stressing. Result: 1,000 runs without hangs > (result with the first bug fix but without the second bug fix: hangs > approx. once in 50 runs). > - On Icelake server with 4GB of PRMRR (EPC), with madvise(). This > additionally stresses the enclave page swapping flows. Two Gramine SGX > test suites running in the background for additional stressing of > swapping (I observe 100% CPU utilization from ksgxd which confirms that > swapping happens). Result: 1,000 runs without hangs. > > (Sorry for the previous copy of this email, accidentally sent to > stable@xxxxxxxxxxxxxxx. Failed to use `--suppress-cc` during a test send.) > > Dmitrii Kuvaiskii (2): > x86/sgx: Resolve EAUG race where losing thread returns SIGBUS > x86/sgx: Resolve EREMOVE page vs EAUG page data race > > arch/x86/kernel/cpu/sgx/encl.c | 10 +++++++--- > arch/x86/kernel/cpu/sgx/encl.h | 3 +++ > arch/x86/kernel/cpu/sgx/ioctl.c | 1 + > 3 files changed, 11 insertions(+), 3 deletions(-) BR, Jarkko