On Wed, Jan 26, 2022 at 04:29:12PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 20, 2022 at 08:28:36AM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> >
> > On 1/20/2022 5:01 AM, Jarkko Sakkinen wrote:
> > > On Tue, 2022-01-18 at 11:14 -0800, Reinette Chatre wrote:
> > >> Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
> > >> triggers the softlockup detector.
> > >>
> > >> Actual SGX systems have 128GB of enclave memory or more. The
> > >> "unclobbered_vdso_oversubscribed" selftest creates one enclave which
> > >> consumes all of the enclave memory on the system. Tearing down such a
> > >> large enclave takes around a minute, most of it in the loop where
> > >> the EREMOVE instruction is applied to each individual 4k enclave
> > >> page.
> > >>
> > >> Spending one minute in a loop triggers the softlockup detector.
> > >>
> > >> Add a cond_resched() to give other tasks a chance to run and placate
> > >> the softlockup detector.
> > >>
> > >> Cc: stable@xxxxxxxxxxxxxxx
> > >> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> > >> Reported-by: Vijay Dhanraj <vijay.dhanraj@xxxxxxxxx>
> > >> Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > >> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > >> ---
> > >> Softlockup message:
> > >> watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
> > >> Kernel panic - not syncing: softlockup: hung tasks
> > >> <snip>
> > >> sgx_encl_release+0x86/0x1c0
> > >> sgx_release+0x11c/0x130
> > >> __fput+0xb0/0x280
> > >> ____fput+0xe/0x10
> > >> task_work_run+0x6c/0xc0
> > >> exit_to_user_mode_prepare+0x1eb/0x1f0
> > >> syscall_exit_to_user_mode+0x1d/0x50
> > >> do_syscall_64+0x46/0xb0
> > >> entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >>
> > >>  arch/x86/kernel/cpu/sgx/encl.c | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c
> > >> b/arch/x86/kernel/cpu/sgx/encl.c
> > >> index 001808e3901c..ab2b79327a8a 100644
> > >> --- a/arch/x86/kernel/cpu/sgx/encl.c
> > >> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > >> @@ -410,6 +410,7 @@ void sgx_encl_release(struct kref *ref)
> > >>  		}
> > >>
> > >>  		kfree(entry);
> > >> +		cond_resched();
> > >>  	}
> > >>
> > >>  	xa_destroy(&encl->page_array);
> > >
> > > I'd add a comment, e.g.
> > >
> > > /* Invoke scheduler to prevent soft lockups. */
> >
> > I could do that. I would like to point out though that there are already
> > six other usages of cond_resched() in the driver and it does indeed
> > seem to be the common pattern. When adding this comment to the now
> > seventh usage it would be the first comment documenting the usage of
> > cond_resched() in the driver.
> >
> > >
> > > Other than that makes sense.
> >
> > Thank you very much for taking a look.
>
> Well, I believe in inline comments to evolution. As in here it was missing,
> a reminder makes sense.

E.g. there are a gazillion uses of kmalloc() in the kernel, but still not
all of them have a comment bound to them...

BR, Jarkko