Re: [BUG] bug report on x86/sgx: ksgxd()

On Thu, Jun 03, 2021 at 09:37:52PM +0000, Sean Christopherson wrote:
> On Thu, Jun 03, 2021, Jarkko Sakkinen wrote:
> > On Wed, Jun 02, 2021 at 11:36:43AM +0800, Du Cheng wrote:
> > > Hi,
> > > 
> > > I would like to report a bug on my Linux box running mainline Linux at:
> > > commit 8124c8a6b35386f73523d27eacb71b5364a68c4c tag: v5.13-rc4
> > > 
> > > After it boots on my Intel NUC, I see this warning in the console log. I
> > > believe it is triggered by a WARN_ON():
> > > 
> > > [    0.628094] sgx: EPC section 0x30200000-0x35f7ffff
> > > [    0.628503] ------------[ cut here ]------------
> > > [    0.628506] WARNING: CPU: 6 PID: 127 at arch/x86/kernel/cpu/sgx/main.c:428 ksgxd+0x1c8/0x1e0
> > > 
> > > 
> > > I have attached my config file with which I compiled the kernel, just in case it is helpful.
> > > 
> > > I am running Ubuntu 21.04 with a mainline kernel, and my box is an Intel NUC:
> > > 
> > > 	Product Name: NUC10i5FNH
> > > 	SKU Number: BXNUC10i5FNH
> > > 	Product Name: NUC10i5FNB
> > 
> > Is it possible to test with 5.12?
> > 
> > Linux does not support that hardware, except for KVM VMs, which was
> > added in 5.13.
> 
> I'm pretty sure that the issue is kthread_stop() being called on ksgxd before
> __sgx_sanitize_pages() completes, and that lack of launch control is what is
> exposing the bug.
> 
> Prior to adding KVM support, sgx_init() bailed immediately because
> X86_FEATURE_SGX was cleared if X86_FEATURE_SGX_LC was unsupported.
> 
> With KVM support, sgx_drv_init() handles the X86_FEATURE_SGX_LC check manually,
> so now there's an easy-to-hit case where sgx_init() will spawn ksgxd and _then_
> fail to initialize, which results in sgx_init() stopping ksgxd before it finishes
> sanitizing the EPC.
> 
> The bug existed before KVM support, it was just much harder to hit because it
> basically required char device registration to fail.
> 
> This should suppress the WARN if ksgxd is stopped early.
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 63d3de02bbcc..bdf31ddfb10d 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -425,7 +425,7 @@ static int ksgxd(void *p)
>         __sgx_sanitize_pages(&sgx_dirty_page_list);
> 
>         /* sanity check: */
> -       WARN_ON(!list_empty(&sgx_dirty_page_list));
> +       WARN_ON(!list_empty(&sgx_dirty_page_list) && !kthread_should_stop());
> 
>         while (!kthread_should_stop()) {
>                 if (try_to_freeze())
> 
> 
> If that works, then
> 
>   Fixes: e7e0545299d8 ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections")
> 
> is probably most appropriate.

Since this could theoretically happen in 5.11, I agree that it's the
right commit.

Can you send a proper patch? I can also put one together if you don't have
the bandwidth.

What you wrote above works as the commit message.

/Jarkko
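
For anyone following the thread, the race Sean describes can be modelled in
plain userspace C. This is only a rough sketch, not kernel code: every name in
it (struct model, model_should_stop(), run_model(), and so on) is hypothetical,
and it assumes the sanitize loop returns early once a stop has been requested,
leaving the remaining pages on the dirty list. The "stop" is simulated
deterministically as arriving after a fixed number of pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of ksgxd; none of these are kernel APIs. */
struct model {
	size_t dirty_pages;	/* pages still on the dirty list */
	size_t sanitized;	/* pages sanitized so far */
	size_t stop_after;	/* stop request arrives after this many pages */
};

/* Stand-in for kthread_should_stop(). */
static bool model_should_stop(const struct model *m)
{
	return m->sanitized >= m->stop_after;
}

/* Stand-in for the sanitize pass: bail out early if asked to stop (assumed). */
static void model_sanitize(struct model *m)
{
	while (m->dirty_pages > 0) {
		if (model_should_stop(m))
			return;
		m->dirty_pages--;
		m->sanitized++;
	}
}

/* Condition of the original check: WARN_ON(!list_empty(...)). */
static bool old_check_warns(const struct model *m)
{
	return m->dirty_pages != 0;
}

/* Condition of the proposed check: also require that no stop was requested. */
static bool new_check_warns(const struct model *m)
{
	return m->dirty_pages != 0 && !model_should_stop(m);
}

/* Run one sanitize pass with the given list size and stop point. */
static struct model run_model(size_t dirty_pages, size_t stop_after)
{
	struct model m = { dirty_pages, 0, stop_after };

	model_sanitize(&m);
	return m;
}
```

With run_model(8, 3) the stop arrives mid-sanitize, five pages are left over,
and only the old condition fires. With a stop point beyond the list size the
list drains fully and neither condition fires, so the proposed check stays
quiet on a normal run as well.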
