Re: [PATCH] x86/sgx: Silence softlockup detection when releasing large enclaves

On Thu, Jan 20, 2022 at 08:28:36AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/20/2022 5:01 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-01-18 at 11:14 -0800, Reinette Chatre wrote:
> >> Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
> >> triggers the softlockup detector.
> >>
> >> Actual SGX systems have 128GB of enclave memory or more.  The
> >> "unclobbered_vdso_oversubscribed" selftest creates one enclave which
> >> consumes all of the enclave memory on the system. Tearing down such a
> >> large enclave takes around a minute, most of it in the loop where
> >> the EREMOVE instruction is applied to each individual 4k enclave
> >> page.
> >>
> >> Spending one minute in a loop triggers the softlockup detector.
> >>
> >> Add a cond_resched() to give other tasks a chance to run and placate
> >> the softlockup detector.
> >>
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> >> Reported-by: Vijay Dhanraj <vijay.dhanraj@xxxxxxxxx>
> >> Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> >> ---
> >> Softlockup message:
> >> watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
> >> Kernel panic - not syncing: softlockup: hung tasks
> >> <snip>
> >> sgx_encl_release+0x86/0x1c0
> >> sgx_release+0x11c/0x130
> >> __fput+0xb0/0x280
> >> ____fput+0xe/0x10
> >> task_work_run+0x6c/0xc0
> >> exit_to_user_mode_prepare+0x1eb/0x1f0
> >> syscall_exit_to_user_mode+0x1d/0x50
> >> do_syscall_64+0x46/0xb0
> >> entry_SYSCALL_64_after_hwframe+0x44/0xae
> >>
> >>  arch/x86/kernel/cpu/sgx/encl.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> >> index 001808e3901c..ab2b79327a8a 100644
> >> --- a/arch/x86/kernel/cpu/sgx/encl.c
> >> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> >> @@ -410,6 +410,7 @@ void sgx_encl_release(struct kref *ref)
> >>                 }
> >>  
> >>                 kfree(entry);
> >> +               cond_resched();
> >>         }
> >>  
> >>         xa_destroy(&encl->page_array);
> > 
> > I'd add a comment, e.g.
> > 
> > /* Invoke scheduler to prevent soft lockups. */
> 
> I could do that. I would like to point out though that there are already
> six other usages of cond_resched() in the driver and it does indeed
> seem to be the common pattern. When adding this comment to the now
> seventh usage it would be the first comment documenting the usage of
> cond_resched() in the driver.
> 
> > 
> > Other than that makes sense.
> 
> Thank you very much for taking a look.

Well, I believe inline comments should be added as the code evolves. Since
one was missing here, a reminder makes sense.

/Jarkko
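
For readers following the thread, the pattern under discussion (yield once per
iteration of a long teardown loop, with a comment saying why) can be sketched
in userspace C. This is only an analogy and not the kernel code: sched_yield()
stands in for cond_resched(), and struct page_entry, build_list() and
release_all() are hypothetical names invented for this illustration.

```c
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-page bookkeeping entry. */
struct page_entry {
	struct page_entry *next;
};

/* Build a singly linked list of up to n entries (shorter if malloc fails). */
static struct page_entry *build_list(size_t n)
{
	struct page_entry *head = NULL;

	for (size_t i = 0; i < n; i++) {
		struct page_entry *e = malloc(sizeof(*e));

		if (!e)
			break;
		e->next = head;
		head = e;
	}
	return head;
}

/* Free every entry and return how many were freed. */
static size_t release_all(struct page_entry *head)
{
	size_t freed = 0;

	while (head) {
		struct page_entry *next = head->next;

		free(head);
		freed++;
		/*
		 * Give other tasks a chance to run during a long teardown.
		 * Userspace stand-in for the cond_resched() call in the patch.
		 */
		sched_yield();
		head = next;
	}
	return freed;
}
```

Calling release_all(build_list(1000)) frees all 1000 entries and yields after
each one. Unlike this sketch, cond_resched() in the kernel only reschedules
when a reschedule is actually pending, so the real call is cheap in the common
case.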


