Hi Dave,
On Fri, 21 Jan 2022 13:57:50 -0600, Dave Hansen <dave.hansen@xxxxxxxxx>
wrote:
Hi Everyone,
There are a few SGX fixes that have showed up in the last week or so,
mostly around RAS and fixing the backing storage issues. Could folks
please give this branch a good thrashing?
https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=x86/sgx
I'm planning to send this bunch up to Linus after 5.17-rc1 comes out.
Kristen, I really dug into the changelogs of your two patches to make it
more clear that they are bugfix and stable@ material. I'd appreciate
some additional eyeballs there.
When testing this with a large enclave loaded by Intel runtime, we saw it
hang
on EADD ioctl and the ioctl never returns.
Also noticed the selftest fails on oversub but the return errno is EINTR
not
ENOMEM as expected.
When user space register signal handler with SA_RESTART flag, EINTR would
be an auto restart of ioctl. So that might be what's going on with Intel
runtime test. And I suspect following code may cause infinite restart of
the
ioctl:
struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
{
struct sgx_epc_page *page;
for ( ; ; ) {
page = __sgx_alloc_epc_page();
if (!IS_ERR(page)) {
page->owner = owner;
break;
}
if (list_empty(&sgx_active_page_list))<-- should also check
sgx_nr_available_backing_pages?
return ERR_PTR(-ENOMEM);
if (!reclaim) {
page = ERR_PTR(-EBUSY);
break;
}
if (signal_pending(current)) {
page = ERR_PTR(-ERESTARTSYS);<---- this is the only exit of
the for loop when backing store limit reaches.
break;
}
sgx_reclaim_pages(); <---- no checking for backing store fail due
to limit here.
cond_resched();
}
Thanks
Haitao