Re: [PATCH 1/2] x86/sgx: Do not fail on incomplete sanitization on premature stop of ksgxd

On Sat, Sep 03, 2022 at 09:01:07AM +0300, Jarkko Sakkinen wrote:
> Unsanitized pages trigger WARN_ON() unconditionally, which can panic the
> whole computer, if /proc/sys/kernel/panic_on_warn is set.
> 
> In sgx_init(), if misc_register() fails or misc_register() succeeds but
> neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
> prematurely stopped. This may leave unsanitized pages, which will result in a
> false warning.
> 
> Refine __sgx_sanitize_pages() to return:
> 
> 1. Zero when the sanitization process is complete or ksgxd has been
>    requested to stop.
> 2. The number of unsanitized pages otherwise.
> 
> Use the return value as the criterion for triggering output, and tone down
> the output to pr_err() to prevent the whole system from being taken down if,
> for some reason, the sanitization process does not complete.
> 
> Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@xxxxxxxxxx/T/#u
> Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
> Cc: stable@xxxxxxxxxxxxxxx # v5.13+
> Reported-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> ---
> v7:
> - Rewrote commit message.
> - Do not return -ECANCELED on premature stop. Instead use zero for both
>   premature stop and complete sanitization.
> 
> v6:
> - Address Reinette's feedback:
>   https://lore.kernel.org/linux-sgx/Yw6%2FiTzSdSw%2FY%2FVO@xxxxxxxxxx/
> 
> v5:
> - Add the klog dump and sysctl option to the commit message.
> 
> v4:
> - Explain expectations for dirty_page_list in the function header, instead
>   of an inline comment.
> - Improve commit message to explain the conditions better.
> - Return the number of pages left dirty to ksgxd() and print warning after
>   the 2nd call, if there are any.
> 
> v3:
> - Remove WARN_ON().
> - Tuned comments and the commit message a bit.
> 
> v2:
> - Replaced WARN_ON() with optional pr_info() inside
>   __sgx_sanitize_pages().
> - Rewrote the commit message.
> - Added the fixes tag.
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 33 ++++++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 515e2a5f25bb..c0a5ce19c608 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -49,17 +49,23 @@ static LIST_HEAD(sgx_dirty_page_list);
>   * Reset post-kexec EPC pages to the uninitialized state. The pages are removed
>   * from the input list, and made available for the page allocator. SECS pages
>   * prepending their children in the input list are left intact.
> + *
> + * Contents of the @dirty_page_list must be thread-local, i.e.
> + * not shared by multiple threads.
> + *
> + * Return 0 when sanitization was successful or kthread was stopped, and the
> + * number of unsanitized pages otherwise.
>   */
> -static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
> +static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list)
>  {
> +	unsigned long left_dirty = 0;
>  	struct sgx_epc_page *page;
>  	LIST_HEAD(dirty);
>  	int ret;
>  
> -	/* dirty_page_list is thread-local, no need for a lock: */
>  	while (!list_empty(dirty_page_list)) {
>  		if (kthread_should_stop())
> -			return;
> +			return 0;
>  
>  		page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
>  
> @@ -92,12 +98,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
>  		} else {
>  			/* The page is not yet clean - move to the dirty list. */
>  			list_move_tail(&page->list, &dirty);
> +			left_dirty++;
>  		}
>  
>  		cond_resched();
>  	}
>  
>  	list_splice(&dirty, dirty_page_list);
> +	return left_dirty;
>  }
>  
>  static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
> @@ -388,17 +396,28 @@ void sgx_reclaim_direct(void)
>  
>  static int ksgxd(void *p)
>  {
> +	unsigned long left_dirty;
> +
>  	set_freezable();
>  
>  	/*
>  	 * Sanitize pages in order to recover from kexec(). The 2nd pass is
>  	 * required for SECS pages, whose child pages blocked EREMOVE.
>  	 */
> -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> +	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> +	pr_debug("%ld unsanitized pages\n", left_dirty);
Nit: this should be %lu, since left_dirty is unsigned long.

>  
> -	/* sanity check: */
> -	WARN_ON(!list_empty(&sgx_dirty_page_list));
> +	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> +	/*
> +	 * Never expected to happen in a working driver. If it happens the bug
> +	 * is expected to be in the sanitization process, but successfully
> +	 * sanitized pages are still valid and driver can be used and most
> +	 * importantly debugged without issues. In short, the global state
> +	 * of kernel is not corrupted so no reason to do any more complicated
> +	 * rollback.
> +	 */
> +	if (left_dirty)
> +		pr_err("%ld unsanitized pages\n", left_dirty);
This should be %lu as well, since left_dirty is unsigned long.
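
To illustrate the specifier nit outside the kernel, here is a minimal user-space sketch. It is not part of the patch: format_left_dirty() is a hypothetical helper, and snprintf() stands in for pr_err()/pr_debug(). It only shows that %lu is the matching conversion for an unsigned long counter.

```c
#include <stdio.h>

/*
 * Hypothetical user-space helper, not kernel code: formats the counter
 * the way the log line in the patch is meant to look, using %lu because
 * the argument is an unsigned long.
 */
static int format_left_dirty(char *buf, size_t len, unsigned long left_dirty)
{
	return snprintf(buf, len, "%lu unsanitized pages\n", left_dirty);
}
```

With %lu, a value above LONG_MAX still prints as a positive number. With %ld the argument type does not match the conversion, which compilers typically flag with a format warning.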

>  
>  	while (!kthread_should_stop()) {
>  		if (try_to_freeze())
> -- 
> 2.37.2
> 
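
For readers following the thread, the return-value contract from the commit message can be modelled in user space roughly as below. This is a sketch under stated assumptions, not the kernel implementation: struct model_page, model_sanitize() and the should_stop flag are hypothetical stand-ins for struct sgx_epc_page, __sgx_sanitize_pages() and kthread_should_stop(), and a SECS page is modelled as clean-able only once its children are gone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an EPC page; not struct sgx_epc_page. */
struct model_page {
	bool dirty;		/* still needs sanitizing */
	int parent;		/* index of the owning SECS page, or -1 */
	unsigned int children;	/* SECS only: child pages not yet removed */
};

/*
 * One sanitization pass. Returns 0 when every page was cleaned or a stop
 * was requested, and the number of pages left dirty otherwise.
 */
static unsigned long model_sanitize(struct model_page *pages, size_t n,
				    const bool *should_stop)
{
	unsigned long left_dirty = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (*should_stop)
			return 0;	/* premature stop is not an error */

		if (!pages[i].dirty)
			continue;

		if (pages[i].children > 0) {
			/* Models a SECS page blocked by its children. */
			left_dirty++;
			continue;
		}

		pages[i].dirty = false;
		if (pages[i].parent >= 0)
			pages[pages[i].parent].children--;
	}

	return left_dirty;
}
```

In this model, a SECS page listed before its two children is left dirty by the first pass (return value 1) and cleaned by the second (return value 0), which is why only the second return value is treated as an error. A stop request returns 0 even though pages remain dirty.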

BR, Jarkko


