Re: [PATCH] x86/sgx: Avoid softlockup from sgx_vepc_release

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 16, 2023 at 10:34 PM Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:
>
> On Tue Aug 15, 2023 at 2:58 AM EEST, Huang, Kai wrote:
> > On Mon, 2023-08-14 at 06:37 +0200, Jack Wang wrote:
> > > We hit softlocup with following call trace:
> > >  kernel: [ 2041.288210] Call Trace:
> > > kernel: [ 2041.288213]  <IRQ>
> > > kernel: [ 2041.288216]  ? watchdog_timer_fn+0x1b4/0x210
> > > kernel: [ 2041.288222]  ? lockup_detector_update_enable+0x50/0x50
> > > kernel: [ 2041.288225]  ? __hrtimer_run_queues+0x127/0x280
> > > kernel: [ 2041.288230]  ? hrtimer_interrupt+0xfc/0x210
> > > kernel: [ 2041.288233]  ? __sysvec_apic_timer_interrupt+0x59/0xd0
> > > kernel: [ 2041.288237]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> > > kernel: [ 2041.288242]  </IRQ>
> > > kernel: [ 2041.288242]  <TASK>
> > > kernel: [ 2041.288244]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> > > kernel: [ 2041.288250]  xa_erase+0x21/0xb0
> > > kernel: [ 2041.288256]  ? sgx_free_epc_page+0x20/0x50
> > > kernel: [ 2041.288260]  sgx_vepc_release+0x75/0x220
> > > kernel: [ 2041.288263]  __fput+0x89/0x250
> > > kernel: [ 2041.288269]  task_work_run+0x59/0x90
> > > kernel: [ 2041.288275]  do_exit+0x337/0x9a0
> > >
> > > Similar like 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves")
> > > The test system has 64GB of enclave memory, and all assigned to a single
> > > VM. Release vepc take longer time and triggers the softlockup warning.
> > >
> > > Add a cond_resched() to give other tasks a chance to run and placate
> > > the softlockup detector.
> > >
> > > Cc: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > > Cc: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> > > Reported-by: Yu Zhang <yu.zhang@xxxxxxxxx>
> > >
> > > Signed-off-by: Jack Wang <jinpu.wang@xxxxxxxxx>
> > > ---
> > >  arch/x86/kernel/cpu/sgx/virt.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> > > index c3e37eaec8ec..01d2795792cc 100644
> > > --- a/arch/x86/kernel/cpu/sgx/virt.c
> > > +++ b/arch/x86/kernel/cpu/sgx/virt.c
> > > @@ -204,6 +204,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
> > >                     continue;
> > >
> > >             xa_erase(&vepc->page_array, index);
> > > +           cond_resched();
> > >     }
> > >
> > >     /*
> > > @@ -222,6 +223,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
> > >                     list_add_tail(&epc_page->list, &secs_pages);
> > >
> > >             xa_erase(&vepc->page_array, index);
> > > +           cond_resched();
> > >     }
> > >
> > >     /*
> >
> > This adds cond_resched() to the first two loops: releasing non-SECS pages and
> > releasing SECS pages which don't have children on remote vepc instance(s).
> > There is a third round loop to release SECS pages which have children on remote
> > vepc instance(s).  I think it's better to add cond_resched() there too, although
> > it's extremely unlikely it could trigger softlockup w/o cond_resched().
> >
> > Also similar to 8795359e35bc ("x86/sgx: Silence softlockup detection when
> > releasing large enclaves"), this looks stable-kernel material.  So perhaps add:
> >
> > Cc: stable@xxxxxxxxxxxxxxx
>
> + would not hurt to have also fixes tag for clarity.
>
> BR, Jarkko

I did it in v2 here,

https://lore.kernel.org/linux-sgx/3b93dfa438fb2f42f917393c3752ffc2dec35e52.camel@xxxxxxxxx/T/#t

Thx for the review!


>




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux