Re: Testing 5.17 bugfix material

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 24, 2022 at 07:42:08PM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 22, 2022 at 01:15:01PM -0600, Haitao Huang wrote:
> > Hi Dave,
> > 
> > On Fri, 21 Jan 2022 13:57:50 -0600, Dave Hansen <dave.hansen@xxxxxxxxx>
> > wrote:
> > 
> > > Hi Everyone,
> > > 
> > > There are a few SGX fixes that have showed up in the last week or so,
> > > mostly around RAS and fixing the backing storage issues.  Could folks
> > > please give this branch a good thrashing?
> > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=x86/sgx
> > > 
> > > I'm planning to send this bunch up to Linus after 5.17-rc1 comes out.
> > > 
> > > Kristen, I really dug into the changelogs of your two patches to make it
> > > more clear that they are bugfix and stable@ material.  I'd appreciate
> > > some additional eyeballs there.
> > 
> > When testing this with a large enclave loaded by Intel runtime, we saw it
> > hang
> > on EADD ioctl and the ioctl never returns.
> > 
> > Also noticed the selftest fails on oversub but the return errno is EINTR not
> > ENOMEM as expected.
> > 
> > When user space register signal handler with SA_RESTART flag, EINTR would
> > be an auto restart of ioctl. So that might be what's going on with Intel
> > runtime test. And I suspect following code may cause infinite restart of the
> > ioctl:
> > 
> > struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
> > {
> >       struct sgx_epc_page *page;
> > 
> >       for ( ; ; ) {
> >           page = __sgx_alloc_epc_page();
> >           if (!IS_ERR(page)) {
> >               page->owner = owner;
> >               break;
> >           }
> > 
> >           if (list_empty(&sgx_active_page_list))<-- should also check
> > sgx_nr_available_backing_pages?
> >               return ERR_PTR(-ENOMEM);

I don't think so but why do you think it should?

> > 
> >           if (!reclaim) {
> >               page = ERR_PTR(-EBUSY);
> >               break;
> >           }
> > 
> >           if (signal_pending(current)) {
> >               page = ERR_PTR(-ERESTARTSYS);<---- this is the only exit of
> > the for loop when backing store limit reaches.
> >               break;
> >           }
> > 
> >           sgx_reclaim_pages(); <---- no checking for backing store fail due
> > to limit here.

Initially I'd think you're making here the right conclusion. Given the
limit sgx_reclaim_pages() should return e.g. boolean or int to tell that
we are out of backing storage.

Kristen?

> >           cond_resched();
> >       }
> > 
> > Thanks
> > Haitao

The reason was that it was not x86/sgx branch of tip but this:

https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=x86/sgx

Sorry for initial reject.

BR, Jarkko



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux