On Thu, Jun 03, 2021, Jarkko Sakkinen wrote:
> On Wed, Jun 02, 2021 at 11:36:43AM +0800, Du Cheng wrote:
> > Hi,
> >
> > I like to report a bug on my linux box running the mainline linux of version:
> > commit 8124c8a6b35386f73523d27eacb71b5364a68c4c tag: v5.13-rc4
> >
> > After it boots on my intel NUC, I encounter this error in the console log, I
> > believe it is triggered by a WARN_ON():
> >
> > [ 0.628094] sgx: EPC section 0x30200000-0x35f7ffff
> > [ 0.628503] ------------[ cut here ]------------
> > [ 0.628506] WARNING: CPU: 6 PID: 127 at arch/x86/kernel/cpu/sgx/main.c:428 ksgxd+0x1c8/0x1e0
> >
> >
> > I have attached my config file with which I compiled the kernel, just in case it is helpful.
> >
> > I am running on ubuntu 21.04 with mainline kernel, and my box is intel NUC:
> >
> > Product Name: NUC10i5FNH
> > SKU Number: BXNUC10i5FNH
> > Product Name: NUC10i5FNB
>
> Is it possible to test with 5.12?
>
> Linux does not support that hardware, except for KVM VM's, which was
> added in 5.13.

I'm pretty sure that the issue is kthread_stop() being called on ksgxd
before __sgx_sanitize_pages() completes, and that lack of launch control
is what is exposing the bug.

Prior to adding KVM support, sgx_init() bailed immediately because
X86_FEATURE_SGX was cleared if X86_FEATURE_SGX_LC was unsupported.  With
KVM support, sgx_drv_init() handles the X86_FEATURE_SGX_LC check manually,
so now there's an easy-to-hit case where sgx_init() will spawn ksgxd and
_then_ fail to initialize, which results in sgx_init() stopping ksgxd
before it finishes sanitizing the EPC.

The bug existed before KVM support, it was just much harder to hit because
it basically required char device registration to fail.

This should suppress the WARN if ksgxd is stopped early.

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 63d3de02bbcc..bdf31ddfb10d 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -425,7 +425,7 @@ static int ksgxd(void *p)
 	__sgx_sanitize_pages(&sgx_dirty_page_list);
 
 	/* sanity check: */
-	WARN_ON(!list_empty(&sgx_dirty_page_list));
+	WARN_ON(!list_empty(&sgx_dirty_page_list) && !kthread_should_stop());
 
 	while (!kthread_should_stop()) {
 		if (try_to_freeze())

If that works, then

  Fixes: e7e0545299d8 ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections")

is probably most appropriate.
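
For anyone who wants to reason about the race without booting a kernel,
here is a rough single-threaded userspace model of it.  This is only a
sketch, not kernel code: struct worker, sanitize_pages(), run_model() and
the stop_after parameter are all hypothetical names made up for the
illustration, and it assumes the sanitizer returns early once a stop has
been requested, leaving the remaining pages on the dirty list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ksgxd's state; NOT the kernel's data structures. */
struct worker {
	size_t dirty_pages;	/* models the length of sgx_dirty_page_list */
	bool stop_requested;	/* models what kthread_should_stop() returns */
};

/*
 * Models the sanitizer loop.  ASSUMPTION: it bails out as soon as a stop
 * is requested.  'stop_after' simulates kthread_stop() arriving after
 * that many pages have been sanitized; SIZE_MAX means "never stopped".
 */
static void sanitize_pages(struct worker *w, size_t stop_after)
{
	size_t done = 0;

	assert(w != NULL);
	while (w->dirty_pages > 0) {
		if (done == stop_after)
			w->stop_requested = true;
		if (w->stop_requested)
			return;	/* pages are left behind on the dirty list */
		w->dirty_pages--;
		done++;
	}
}

/* Original check: warns whenever any dirty page remains. */
static bool old_check_warns(const struct worker *w)
{
	return w->dirty_pages != 0;
}

/* Patched check: leftover pages are expected if the thread was stopped. */
static bool new_check_warns(const struct worker *w)
{
	return w->dirty_pages != 0 && !w->stop_requested;
}

/* Runs one scenario and reports whether the sanity check would WARN. */
bool run_model(size_t pages, size_t stop_after, bool patched)
{
	struct worker w = { .dirty_pages = pages, .stop_requested = false };

	sanitize_pages(&w, stop_after);
	return patched ? new_check_warns(&w) : old_check_warns(&w);
}
```

In this model, run_model(8, 3, false) reports a warning because the stop
lands with five pages still dirty, while run_model(8, 3, true) stays
quiet.  With stop_after set to SIZE_MAX the list drains fully and neither
variant warns, so the patched check still catches a sanitizer that leaves
pages behind without having been stopped.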