Re: [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

"Jarkko Sakkinen" <jarkko@xxxxxxxxxx> · Mon, 29 Apr 2024 16:11:03 +0300

On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote:
> Two enclave threads may try to add and remove the same enclave page
> simultaneously (e.g., if the SGX runtime supports both lazy allocation
> and `MADV_DONTNEED` semantics). Consider this race:
>
> 1. T1 performs page removal in sgx_encl_remove_pages() and stops right
>    after removing the page table entry and right before re-acquiring the
>    enclave lock to EREMOVE and xa_erase(&encl->page_array) the page.
> 2. T2 tries to access the page, and #PF[not_present] is raised. The
>    condition to EAUG in sgx_vma_fault() is not satisfied because the
>    page is still present in encl->page_array, thus the SGX driver
>    assumes that the fault happened because the page was swapped out. The
>    driver continues on a code path that installs a page table entry
>    *without* performing EAUG.
> 3. The enclave page metadata is in inconsistent state: the PTE is
>    installed but there was no EAUG. Thus, T2 in userspace infinitely
>    receives SIGSEGV on this page (and EACCEPT always fails).
>
> Fix this by making sure that T1 (the page-removing thread) always wins
> this data race. In particular, the page-being-removed is marked as such,
> and T2 retries until the page is fully removed.
>
> Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@xxxxxxxxx>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c  | 3 ++-
>  arch/x86/kernel/cpu/sgx/encl.h  | 3 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 41f14b1a3025..7ccd8b2fce5f 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -257,7 +257,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
>  
>  	/* Entry successfully located. */
>  	if (entry->epc_page) {
> -		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
> +		if (entry->desc & (SGX_ENCL_PAGE_BEING_RECLAIMED |
> +				   SGX_ENCL_PAGE_BEING_REMOVED))
>  			return ERR_PTR(-EBUSY);
>  
>  		return entry;
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index f94ff14c9486..fff5f2293ae7 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -25,6 +25,9 @@
>  /* 'desc' bit marking that the page is being reclaimed. */
>  #define SGX_ENCL_PAGE_BEING_RECLAIMED	BIT(3)
>  
> +/* 'desc' bit marking that the page is being removed. */
> +#define SGX_ENCL_PAGE_BEING_REMOVED	BIT(2)
> +
>  struct sgx_encl_page {
>  	unsigned long desc;
>  	unsigned long vm_max_prot_bits:8;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index b65ab214bdf5..c542d4dd3e64 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>  		 * Do not keep encl->lock because of dependency on
>  		 * mmap_lock acquired in sgx_zap_enclave_ptes().
>  		 */
> +		entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;
>  		mutex_unlock(&encl->lock);
>  
>  		sgx_zap_enclave_ptes(encl, addr);

It is somewhat trivial to NAK this as the commit message does
not do any effort describing the new flag. By default at least
I have strong opposition against any new flags related to
reclaiming even if it needs a bit of extra synchronization
work in the user space.

One way to describe concurrency scenarios would be to take
example from https://www.kernel.org/doc/Documentation/memory-barriers.txt

I.e. see the examples with CPU 1 and CPU 2.

BR, Jarkko