On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote: > Hi Jarkko and Haitao, > > On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: > > From: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> > > > > For EMODT and EREMOVE ioctl()'s with a large range, kernel > > may not finish in one shot and return EAGAIN error code > > and count of bytes of EPC pages on that operations are > > finished successfully. > > > > Change the unclobbered_vdso_oversubscribed_remove test > > to rerun the ioctl()'s in a loop, updating offset and length > > using the byte count returned in each iteration. > > > > Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") > > Should this patch be moved to the "critical fixes for v6.0" series? I think not because it does not risk stability of the kernel itself. It's "nice to have" but not mandatory. > > > Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> > > Tested-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > > --- > > v3: > > * Added a fixes tag. The bug is in v6.0 patches. > > * Added my tested-by (the bug reproduced in my NUC often). > > v2: > > * Changed branching in EAGAIN condition so that else branch > > is not required. > > * Addressed Reinette's feedback: > > --- > > tools/testing/selftests/sgx/main.c | 42 ++++++++++++++++++++++++------ > > 1 file changed, 34 insertions(+), 8 deletions(-) > > > > diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c > > index 9820b3809c69..59cca806eda1 100644 > > --- a/tools/testing/selftests/sgx/main.c > > +++ b/tools/testing/selftests/sgx/main.c > > @@ -390,6 +390,7 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > > struct encl_segment *heap; > > unsigned long total_mem; > > int ret, errno_save; > > + unsigned long count; > > unsigned long addr; > > unsigned long i; > > > > @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > > modt_ioc.offset = heap->offset; > > modt_ioc.length = heap->size; > > modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; > > - > > + count = 0; > > TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", > > heap->size); > > - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > - errno_save = ret == -1 ? errno : 0; > > + do { > > + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > + > > + errno_save = ret == -1 ? errno : 0; > > + if (errno_save != EAGAIN) > > + break; > > + > > + EXPECT_EQ(modt_ioc.result, 0); > > If this check triggers then there is something seriously wrong and in that case > it may also be that this loop may be unable to terminate or the error condition would > keep appearing until the loop terminates (which may be many iterations). Considering > the severity and risk I do think that ASSERT_EQ() would be more appropriate, > similar to how ASSERT_EQ() is used in patch 5/5. > > Apart from that I think that this looks good. > > Thank you very much for adding this. > > Reinette Hmm... I could along the lines: /* * Get time since Epoch is milliseconds. */ unsigned long get_time(void) { struct timeval start; gettimeofday(&start, NULL); return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L; } and #define IOCTL_RETRY_TIMEOUT 100 In the test function: unsigned long start_time; /* ... */ start_time = get_time(); do { EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); /* ... */ } /* ... */ What do you think? BR, Jarkko