On Sat, Jan 22, 2022 at 01:15:01PM -0600, Haitao Huang wrote: > Hi Dave, > > On Fri, 21 Jan 2022 13:57:50 -0600, Dave Hansen <dave.hansen@xxxxxxxxx> > wrote: > > > Hi Everyone, > > > > There are a few SGX fixes that have showed up in the last week or so, > > mostly around RAS and fixing the backing storage issues. Could folks > > please give this branch a good thrashing? > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=x86/sgx > > > > I'm planning to send this bunch up to Linus after 5.17-rc1 comes out. > > > > Kristen, I really dug into the changelogs of your two patches to make it > > more clear that they are bugfix and stable@ material. I'd appreciate > > some additional eyeballs there. > > When testing this with a large enclave loaded by Intel runtime, we saw it > hang > on EADD ioctl and the ioctl never returns. > > Also noticed the selftest fails on oversub but the return errno is EINTR not > ENOMEM as expected. > > When user space register signal handler with SA_RESTART flag, EINTR would > be an auto restart of ioctl. So that might be what's going on with Intel > runtime test. And I suspect following code may cause infinite restart of the > ioctl: > > struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) > { > struct sgx_epc_page *page; > > for ( ; ; ) { > page = __sgx_alloc_epc_page(); > if (!IS_ERR(page)) { > page->owner = owner; > break; > } > > if (list_empty(&sgx_active_page_list))<-- should also check > sgx_nr_available_backing_pages? > return ERR_PTR(-ENOMEM); > > if (!reclaim) { > page = ERR_PTR(-EBUSY); > break; > } > > if (signal_pending(current)) { > page = ERR_PTR(-ERESTARTSYS);<---- this is the only exit of > the for loop when backing store limit reaches. > break; > } > > sgx_reclaim_pages(); <---- no checking for backing store fail due > to limit here. > cond_resched(); > } > > Thanks > Haitao Haitao, this does not match what I see in tip/x86/sgx: struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) { struct sgx_epc_page *page; for ( ; ; ) { page = __sgx_alloc_epc_page(); if (!IS_ERR(page)) { page->owner = owner; break; } if (list_empty(&sgx_active_page_list)) return ERR_PTR(-ENOMEM); if (!reclaim) { page = ERR_PTR(-EBUSY); break; } if (signal_pending(current)) { page = ERR_PTR(-ERESTARTSYS); break; } sgx_reclaim_pages(); cond_resched(); } if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) wake_up(&ksgxd_waitq); return page; } Just in case I also checked tip/master, and the result is the same. /Jarkko