Re: [PATCH v4] x86/sgx: Do not consider unsanitized pages an error

On Fri, Aug 26, 2022 at 02:51:15PM +0200, Paul Menzel wrote:
> Dear Jarkko,
> 
> 
> Thank you for the patch.

No, thank you for reporting this and all the help with testing
the change :-)

> Am 26.08.22 um 03:41 schrieb Jarkko Sakkinen:
> > In sgx_init(), if misc_register() for the provision device fails, and
> > neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
> > prematurely stopped.
> > 
> > This triggers WARN_ON() because sgx_dirty_page_list ends up being
> > non-empty. Ultimately this can crash the kernel, depending on the kernel
> > command line, which is not correct behavior because the SGX driver is
> > not malfunctioning.
> 
> Maybe paste the WARN_ON trace, so `git log` can be searched for the trace
> too.

It's a decent suggestion, I agree.

It would probably also be good to mention /proc/sys/kernel/panic_on_warn.
When that is set to '1', any WARN() panics the kernel.
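
For anyone who wants to check this setting on a test machine, here is a
minimal userspace sketch, not kernel code. The helper names
parse_panic_on_warn() and read_panic_on_warn() are made up for
illustration; the only thing taken from above is the sysctl path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: interpret the text read from the sysctl file.
 * Returns 1 if the value is non-zero (WARN() would panic), 0 if it is
 * zero, and -1 if no number could be parsed.
 */
static int parse_panic_on_warn(const char *text)
{
	char *end;
	long value;

	assert(text != NULL);

	value = strtol(text, &end, 10);
	if (end == text)
		return -1;

	return value != 0;
}

/*
 * Hypothetical helper: read the first line of @path and parse it.
 * Returns -1 if the file cannot be opened or is empty.
 */
static int read_panic_on_warn(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;

	if (fgets(buf, sizeof(buf), f))
		ret = parse_panic_on_warn(buf);

	fclose(f);
	return ret;
}
```

Calling read_panic_on_warn("/proc/sys/kernel/panic_on_warn") should
return 1 on a system where a stray WARN_ON() like the one reported here
would take the machine down.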

> 
> > Print simple warning instead, and improve the output by printing the
> > number of unsanitized pages.
> 
> See below, but no warning seems to be logged in my case now. (I should test
> Linus’ current master too.)
> 
> > Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@xxxxxxxxxx/T/#u
> > Reported-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> > Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
> > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > ---
> > Cc: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> > Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > Cc: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > 
> > v4:
> > - Explain expectations for dirty_page_list in the function header, instead
> >    of an inline comment.
> > - Improve commit message to explain the conditions better.
> > - Return the number of pages left dirty to ksgxd() and print warning after
> >    the 2nd call, if there are any.
> > 
> > v3:
> > - Remove WARN_ON().
> > - Tuned comments and the commit message a bit.
> > 
> > v2:
> > - Replaced WARN_ON() with optional pr_info() inside
> >    __sgx_sanitize_pages().
> > - Rewrote the commit message.
> > - Added the fixes tag.
> > ---
> >   arch/x86/kernel/cpu/sgx/main.c | 19 +++++++++++++------
> >   1 file changed, 13 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 515e2a5f25bb..903100fcfce3 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -49,17 +49,20 @@ static LIST_HEAD(sgx_dirty_page_list);
> >    * Reset post-kexec EPC pages to the uninitialized state. The pages are removed
> >    * from the input list, and made available for the page allocator. SECS pages
> >    * prepending their children in the input list are left intact.
> > + *
> > + * Contents of the @dirty_page_list must be thread-local, i.e.
> > + * not shared by multiple threads.
> >    */
> > -static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
> > +static int __sgx_sanitize_pages(struct list_head *dirty_page_list)
> >   {
> >   	struct sgx_epc_page *page;
> > +	int left_dirty = 0;
> >   	LIST_HEAD(dirty);
> >   	int ret;
> > -	/* dirty_page_list is thread-local, no need for a lock: */
> >   	while (!list_empty(dirty_page_list)) {
> >   		if (kthread_should_stop())
> > -			return;
> > +			break;
> >   		page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
> > @@ -92,12 +95,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
> >   		} else {
> >   			/* The page is not yet clean - move to the dirty list. */
> >   			list_move_tail(&page->list, &dirty);
> > +			left_dirty++;
> >   		}
> >   		cond_resched();
> >   	}
> >   	list_splice(&dirty, dirty_page_list);
> > +	return left_dirty;
> >   }
> >   static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
> > @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)
> >   static int ksgxd(void *p)
> >   {
> > +	int left_dirty;
> > +
> >   	set_freezable();
> >   	/*
> > @@ -395,10 +402,10 @@ static int ksgxd(void *p)
> >   	 * required for SECS pages, whose child pages blocked EREMOVE.
> >   	 */
> >   	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > -	/* sanity check: */
> > -	WARN_ON(!list_empty(&sgx_dirty_page_list));
> > +	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > +	if (left_dirty)
> > +		pr_warn("%d unsanitized pages\n", left_dirty);
> >   	while (!kthread_should_stop()) {
> >   		if (try_to_freeze())
> 
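
For readers following along, the reason for calling the sanitizer twice
can be modelled in userspace roughly as below. This is only a sketch
under simplifying assumptions: struct model_page, try_remove() and
sanitize_pass() are invented names, removal is reduced to a counter
check instead of EREMOVE, and an array stands in for the kernel list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an EPC page; not the kernel's sgx_epc_page. */
struct model_page {
	int is_secs;               /* non-zero if this models a SECS page */
	int live_children;         /* children still blocking removal */
	struct model_page *parent; /* owning SECS page, or NULL */
	int dirty;                 /* non-zero until sanitized */
};

/*
 * Model of the removal step: a SECS page cannot be removed while it
 * still has children. Returns 0 on success, -1 if removal is blocked.
 */
static int try_remove(struct model_page *page)
{
	assert(page != NULL);

	if (page->is_secs && page->live_children > 0)
		return -1;

	if (page->parent)
		page->parent->live_children--;

	page->dirty = 0;
	return 0;
}

/*
 * One pass over the pages. Returns how many pages are still dirty
 * after the pass, mirroring the left_dirty counter in the patch.
 */
static int sanitize_pass(struct model_page **pages, size_t count)
{
	int left_dirty = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (!pages[i]->dirty)
			continue;
		if (try_remove(pages[i]) != 0)
			left_dirty++;
	}

	return left_dirty;
}
```

With a SECS page listed before its two children, the first pass leaves
one page dirty (the SECS page) and the second pass leaves none. A
non-zero result from the second pass is the case that now produces the
pr_warn() instead of the WARN_ON().
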
> I tested this on top of commit 4c612826bec1 (Merge tag 'net-6.0-rc3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net) and the warning
> trace is gone.
> 
>     [    0.255192] calling  sgx_init+0x0/0x409 @ 1
>     [    0.255207] sgx: EPC section 0x40200000-0x45f7ffff
>     [    0.255747] initcall sgx_init+0x0/0x409 returned -19 after 552 usecs
> 
> (OT: If -19 suggests something failed, a message explaining why
> sgx_init() failed would be nice.)
> 
> Please find the whole output of `dmesg` attached.

Thanks for testing this!

Hmm... Right, that is an interesting observation. Usually a driver returns
-ENODEV, but since this is initialized by the core, I guess we should
actually return 0?
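
For reference, the -19 in the initcall line above is -ENODEV on Linux.
A small userspace sketch that decodes such a return value could look
like this; initcall_ret_str() is a hypothetical helper name, and
strerror() is the standard C library call.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: describe an initcall-style return value, where
 * 0 means success and a negative value is a negated errno.
 */
static const char *initcall_ret_str(int ret)
{
	if (ret == 0)
		return "success";
	if (ret > 0)
		return "not an errno-style value";

	return strerror(-ret);
}
```

Passing -19 gives the libc description of ENODEV, which is about as
much as the current log line tells the user.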

Dave, any thoughts on this?

BR, Jarkko


