Re: [PATCH] x86/sgx: Silence softlockup detection when releasing large enclaves

On Wed, Jan 26, 2022 at 04:29:12PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 20, 2022 at 08:28:36AM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> > 
> > On 1/20/2022 5:01 AM, Jarkko Sakkinen wrote:
> > > On Tue, 2022-01-18 at 11:14 -0800, Reinette Chatre wrote:
> > >> Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
> > >> triggers the softlockup detector.
> > >>
> > >> Actual SGX systems have 128GB of enclave memory or more.  The
> > >> "unclobbered_vdso_oversubscribed" selftest creates one enclave which
> > >> consumes all of the enclave memory on the system. Tearing down such a
> > >> large enclave takes around a minute, most of it in the loop where
> > >> the EREMOVE instruction is applied to each individual 4k enclave
> > >> page.
> > >>
> > >> Spending one minute in a loop triggers the softlockup detector.
> > >>
> > >> Add a cond_resched() to give other tasks a chance to run and placate
> > >> the softlockup detector.
> > >>
> > >> Cc: stable@xxxxxxxxxxxxxxx
> > >> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> > >> Reported-by: Vijay Dhanraj <vijay.dhanraj@xxxxxxxxx>
> > >> Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > >> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > >> ---
> > >> Softlockup message:
> > >> watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
> > >> Kernel panic - not syncing: softlockup: hung tasks
> > >> <snip>
> > >> sgx_encl_release+0x86/0x1c0
> > >> sgx_release+0x11c/0x130
> > >> __fput+0xb0/0x280
> > >> ____fput+0xe/0x10
> > >> task_work_run+0x6c/0xc0
> > >> exit_to_user_mode_prepare+0x1eb/0x1f0
> > >> syscall_exit_to_user_mode+0x1d/0x50
> > >> do_syscall_64+0x46/0xb0
> > >> entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >>
> > >>  arch/x86/kernel/cpu/sgx/encl.c | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > >> index 001808e3901c..ab2b79327a8a 100644
> > >> --- a/arch/x86/kernel/cpu/sgx/encl.c
> > >> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > >> @@ -410,6 +410,7 @@ void sgx_encl_release(struct kref *ref)
> > >>                 }
> > >>  
> > >>                 kfree(entry);
> > >> +               cond_resched();
> > >>         }
> > >>  
> > >>         xa_destroy(&encl->page_array);
> > > 
> > > I'd add a comment, e.g.
> > > 
> > > /* Invoke scheduler to prevent soft lockups. */
> > 
> > I could do that. I would like to point out though that there are already
> > six other usages of cond_resched() in the driver and it does indeed
> > seem to be the common pattern. When adding this comment to the now
> > seventh usage it would be the first comment documenting the usage of
> > cond_resched() in the driver.
> > 
> > > 
> > > Other than that makes sense.
> > 
> > Thank you very much for taking a look.
> 
> Well, I believe inline comments should evolve over time. Since one was
> missing here, a reminder makes sense.

E.g. there are a gazillion uses of kmalloc() in the kernel, but still not
all of them have a comment attached to them...

BR, Jarkko
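
For readers following the thread, the pattern under discussion (freeing a
long list of entries while giving the scheduler a chance to run after each
one) can be sketched in userspace. This is only an illustration under stated
assumptions, not the driver code: struct page_entry and release_entries()
are hypothetical names, and sched_yield() stands in for the kernel's
cond_resched(). The two are not equivalent: cond_resched() reschedules only
when a reschedule is pending, while sched_yield() always offers up the CPU.

```c
#include <sched.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-page entry; not the kernel's structure. */
struct page_entry {
	void *backing;
};

/*
 * Free every non-NULL entry in the array and yield the CPU after each one.
 * sched_yield() is a userspace analogue (an assumption of this sketch) of
 * the cond_resched() call added by the patch.
 * Returns the number of entries released.
 */
static size_t release_entries(struct page_entry **entries, size_t count)
{
	size_t released = 0;

	for (size_t i = 0; i < count; i++) {
		if (!entries[i])
			continue;

		free(entries[i]->backing);
		free(entries[i]);
		entries[i] = NULL;
		released++;

		/* Give other tasks a chance to run during a long teardown. */
		sched_yield();
	}

	return released;
}
```

Yielding once per entry mirrors the one-line patch; a variant that yields
every N entries would trade scheduling latency for less overhead.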


