Re: Support SGX2 V5: Seg-fault with EACCEPT for large number of EPC pages

On Fri, Jul 29, 2022 at 04:01:04PM +0000, Dhanraj, Vijay wrote:
> Hi All,
> 
> I recently tested the V5 version of the patch with Gramine and ran into a seg-fault during EPC allocation, where pages are added with `EAUG` via `EACCEPT`. Allocation worked fine for smaller requests, even up to 2GB, but a 4GB allocation triggered a seg-fault.
> Huang, Haitao and I created a simple patch to repro this issue using the SGX selftests and we do see the issue when using V5 (5.18.0-rc5) but cannot repro the issue in V4 (5.18.0-rc2). Not sure if this is a driver issue or kernel, can you please check?
> 
> Results with V5 using modified `augment_via_eaccept` test:
> #  RUN           enclave.augment_via_eaccept ...
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 8192, seg->size = 8192
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 12288, seg->size = 4096
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 36864, seg->size = 24576
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 40960, seg->size = 4096
> # main.c:1153:augment_via_eaccept:mmaping pages at end of enclave...
> # main.c:1167:augment_via_eaccept:Entering enclave to run EACCEPT for each page of 8589934592 bytes may take a while ...
> # main.c:1184:augment_via_eaccept:Expected self->run.exception_vector (14) == 0 (0)
> # main.c:1185:augment_via_eaccept:Expected self->run.exception_error_code (4) == 0 (0)
> # main.c:1186:augment_via_eaccept:Expected self->run.exception_addr (140106113478656) == 0 (0)
> # main.c:1188:augment_via_eaccept:Expected self->run.function (3) == EEXIT (4)
> # augment_via_eaccept: Test terminated by assertion
> 
> Results with V4 using modified `augment_via_eaccept` test:
> #  RUN           enclave.augment_via_eaccept ...
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 8192, seg->size = 8192
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 12288, seg->size = 4096
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 36864, seg->size = 24576
> # main.c:1135:augment_via_eaccept:test enclave: total_size = 40960, seg->size = 4096
> # main.c:1153:augment_via_eaccept:mmaping pages at end of enclave...
> # main.c:1167:augment_via_eaccept:Entering enclave to run EACCEPT for each page of 8589934592 bytes may take a while ...
> #            OK  enclave.augment_via_eaccept
> 
> 
> Test Patch:
> diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
> index 94bdeac1cf04..7de1b15c90b1 100644
> --- a/tools/testing/selftests/sgx/load.c
> +++ b/tools/testing/selftests/sgx/load.c
> @@ -171,7 +171,8 @@ uint64_t encl_get_entry(struct encl *encl, const char *symbol)
>  	return 0;
>  }
>  
> -bool encl_load(const char *path, struct encl *encl, unsigned long heap_size)
> +bool encl_load(const char *path, struct encl *encl, unsigned long heap_size,
> +			   unsigned long edmm_size)
>  {
>  	const char device_path[] = "/dev/sgx_enclave";
>  	struct encl_segment *seg;
> @@ -300,7 +301,7 @@ bool encl_load(const char *path, struct encl *encl, unsigned long heap_size)
>  
>  	encl->src_size = encl->segment_tbl[j].offset + encl->segment_tbl[j].size;
>  
> -	for (encl->encl_size = 4096; encl->encl_size < encl->src_size; )
> +	for (encl->encl_size = 4096; encl->encl_size < encl->src_size + edmm_size;)
>  		encl->encl_size <<= 1;
>  
>  	return true;
> diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
> index 9820b3809c69..8d7ce9389c8f 100644
> --- a/tools/testing/selftests/sgx/main.c
> +++ b/tools/testing/selftests/sgx/main.c
> @@ -25,6 +25,8 @@ static const uint64_t MAGIC = 0x1122334455667788ULL;
>  static const uint64_t MAGIC2 = 0x8877665544332211ULL;
>  vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
>  
> +static const unsigned long edmm_size = 8589934592; //8G
> +
>  /*
>   * Security Information (SECINFO) data structure needed by a few SGX
>   * instructions (eg. ENCLU[EACCEPT] and ENCLU[EMODPE]) holds meta-data
> @@ -183,7 +185,7 @@ static bool setup_test_encl(unsigned long heap_size, struct encl *encl,
>  	unsigned int i;
>  	void *addr;
>  
> -	if (!encl_load("test_encl.elf", encl, heap_size)) {
> +	if (!encl_load("test_encl.elf", encl, heap_size, edmm_size)) {
>  		encl_delete(encl);
>  		TH_LOG("Failed to load the test enclave.");
>  		return false;
> @@ -1104,14 +1106,19 @@ TEST_F(enclave, augment)
>   * Test for the addition of pages to an initialized enclave via a
>   * pre-emptive run of EACCEPT on page to be added.
>   */
> -TEST_F(enclave, augment_via_eaccept)
> +/*
> + * Test for the addition of pages to an initialized enclave via a
> + * pre-emptive run of EACCEPT on page to be added.
> + */
> +/*TEST_F(enclave, augment_via_eaccept)*/
> +TEST_F_TIMEOUT(enclave, augment_via_eaccept, 900)

Could you make this a proper patch for kselftest?

It looks fine. The only thing you need to do is, instead of modifying the
existing test, add a new test called augment_via_eaccept_long. Please also
define this before the TEST_F's:

#define TIMEOUT_LONG 900 /* seconds */

Then we can test against a commit instead of an attachment, and we
obviously want a test case for this once the issue has been fixed. I can
also easily build a kernel for Geminilake, XPS-20 and Icelake server, and
run the equivalent test.

BR, Jarkko


