Re: Testing 5.17 bugfix material

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 22, 2022 at 01:15:01PM -0600, Haitao Huang wrote:
> Hi Dave,
> 
> On Fri, 21 Jan 2022 13:57:50 -0600, Dave Hansen <dave.hansen@xxxxxxxxx>
> wrote:
> 
> > Hi Everyone,
> > 
> > There are a few SGX fixes that have showed up in the last week or so,
> > mostly around RAS and fixing the backing storage issues.  Could folks
> > please give this branch a good thrashing?
> > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=x86/sgx
> > 
> > I'm planning to send this bunch up to Linus after 5.17-rc1 comes out.
> > 
> > Kristen, I really dug into the changelogs of your two patches to make it
> > more clear that they are bugfix and stable@ material.  I'd appreciate
> > some additional eyeballs there.
> 
> When testing this with a large enclave loaded by Intel runtime, we saw it
> hang
> on EADD ioctl and the ioctl never returns.
> 
> Also noticed the selftest fails on oversub but the return errno is EINTR not
> ENOMEM as expected.
> 
> When user space register signal handler with SA_RESTART flag, EINTR would
> be an auto restart of ioctl. So that might be what's going on with Intel
> runtime test. And I suspect following code may cause infinite restart of the
> ioctl:
> 
> struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
> {
>       struct sgx_epc_page *page;
> 
>       for ( ; ; ) {
>           page = __sgx_alloc_epc_page();
>           if (!IS_ERR(page)) {
>               page->owner = owner;
>               break;
>           }
> 
>           if (list_empty(&sgx_active_page_list))<-- should also check
> sgx_nr_available_backing_pages?
>               return ERR_PTR(-ENOMEM);
> 
>           if (!reclaim) {
>               page = ERR_PTR(-EBUSY);
>               break;
>           }
> 
>           if (signal_pending(current)) {
>               page = ERR_PTR(-ERESTARTSYS);<---- this is the only exit of
> the for loop when backing store limit reaches.
>               break;
>           }
> 
>           sgx_reclaim_pages(); <---- no checking for backing store fail due
> to limit here.
>           cond_resched();
>       }
> 
> Thanks
> Haitao

Haitao, this does not match what I see in tip/x86/sgx:

struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
{
	struct sgx_epc_page *page;

	for ( ; ; ) {
		page = __sgx_alloc_epc_page();
		if (!IS_ERR(page)) {
			page->owner = owner;
			break;
		}

		if (list_empty(&sgx_active_page_list))
			return ERR_PTR(-ENOMEM);

		if (!reclaim) {
			page = ERR_PTR(-EBUSY);
			break;
		}

		if (signal_pending(current)) {
			page = ERR_PTR(-ERESTARTSYS);
			break;
		}

		sgx_reclaim_pages();
		cond_resched();
	}

	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
		wake_up(&ksgxd_waitq);

	return page;
}

Just in case I also checked tip/master, and the result is the same.

/Jarkko



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux