On Tue, Jun 15, 2021 at 05:44:58PM -0700, Sean Christopherson wrote: > Don't WARN on having unsanitized EPC pages if ksgxd is stopped early, > e.g. if sgx_init() realizes there will be no downstream consumers of EPC. > If ksgxd is stopped early, EPC pages may be left on the dirty list, but > that's ok because ksgxd is only stopped if SGX initialization failed or > if the kernel is going down. In either case, the EPC won't be used. > > This bug was exposed by the addition of KVM support, but has existed and > was hittable since the original sanitization code was added. Prior to > adding KVM support, if Launch Control was not fully enabled, e.g. when > running on older hardware, sgx_init() bailed immediately before spawning > ksgxd because X86_FEATURE_SGX was cleared if X86_FEATURE_SGX_LC was > unsupported. > > With KVM support, sgx_drv_init() handles the X86_FEATURE_SGX_LC check > manually, so now there's any easy-to-hit case where sgx_init() will spawn > ksgxd and _then_ fail to initialize, which results in sgx_init() stopping > ksgxd before it finishes sanitizing the EPC. > > Prior to KVM support, the bug was much harder to hit because it basically > required char device registration to fail. > > Reported-by: Du Cheng <ducheng2@xxxxxxxxx> > Fixes: e7e0545299d8 ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections") > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- > > Lightly tested due to lack of hardware. I hacked the flow to verify that > stopping early will leave work pending, and that rechecking should_stop() > suppress the resulting WARN. > > arch/x86/kernel/cpu/sgx/main.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index ad904747419e..fbad2b9625a5 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -425,7 +425,7 @@ static int ksgxd(void *p) > __sgx_sanitize_pages(&sgx_dirty_page_list); > > /* sanity check: */ > - WARN_ON(!list_empty(&sgx_dirty_page_list)); > + WARN_ON(!list_empty(&sgx_dirty_page_list) && !kthread_should_stop()); > > while (!kthread_should_stop()) { > if (try_to_freeze()) > -- Hmm, this looks weird. Why aren't we starting ksgxd only after *everything* has initialized successfully? I.e., after both kvm and native drivers' init functions have succeeded? Then you won't have to do this kthread_should_stop() thing after the fact. Btw, you have the same thing in the while loop's termination condition two lines down which, if I have to look at it later, would make me scratch head as to what TH is going on here. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette