On Thu, Sep 01, 2022 at 03:34:10PM -0700, Reinette Chatre wrote: > Hi Jarkko, > > On 9/1/2022 2:53 PM, Jarkko Sakkinen wrote: > > On Wed, Aug 31, 2022 at 01:39:53PM -0700, Reinette Chatre wrote: > >> On 8/31/2022 10:38 AM, Jarkko Sakkinen wrote: > > >> I think I am missing something here. A lot of logic is added here but I > >> do not see why it is necessary. ksgxd() knows via kthread_should_stop() if > >> the reclaimer was canceled. I am thus wondering, could the above not be > >> simplified to something similar to V1: > >> > >> @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void) > >> > >> static int ksgxd(void *p) > >> { > >> + unsigned long left_dirty; > >> + > >> set_freezable(); > >> > >> /* > >> @@ -395,10 +402,10 @@ static int ksgxd(void *p) > >> * required for SECS pages, whose child pages blocked EREMOVE. > >> */ > >> __sgx_sanitize_pages(&sgx_dirty_page_list); > > > > IMHO, would make sense also to have here: > > > > if (!kthread_should_stop()) > > return 0; > > > > Would this not prematurely stop the thread when it should not be? > > >> - __sgx_sanitize_pages(&sgx_dirty_page_list); > >> > >> - /* sanity check: */ > >> - WARN_ON(!list_empty(&sgx_dirty_page_list)); > >> + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list); > >> + if (left_dirty && !kthread_should_stop()) > >> + pr_err("%lu unsanitized pages\n", left_dirty); > > > > That would be incorrect, if the function returned > > because of kthread stopped. > > > I should have highlighted this but in my example I changed > left_dirty to be "unsigned long" with the intention that the > "return -ECANCELED" is replaced with "return 0". It wasn't supposed to be, it's an error. Thanks for spotting that. > > __sgx_sanitize_pages() returns 0 when it exits because of > kthread stopped. > > To elaborate I was thinking about: > > +static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list) > { > + unsigned long left_dirty = 0; > struct sgx_epc_page *page; > LIST_HEAD(dirty); > int ret; > > - /* dirty_page_list is thread-local, no need for a lock: */ > while (!list_empty(dirty_page_list)) { > if (kthread_should_stop()) > - return; > + return 0; > > page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); > > @@ -92,12 +95,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) > } else { > /* The page is not yet clean - move to the dirty list. */ > list_move_tail(&page->list, &dirty); > + left_dirty++; > } > > cond_resched(); > } > > list_splice(&dirty, dirty_page_list); > + return left_dirty; > } > > > and then with what I had in previous email the checks should work: > > @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void) > > static int ksgxd(void *p) > { > + unsigned long left_dirty; > + > set_freezable(); > > /* > @@ -395,10 +402,10 @@ static int ksgxd(void *p) > * required for SECS pages, whose child pages blocked EREMOVE. > */ > __sgx_sanitize_pages(&sgx_dirty_page_list); > - __sgx_sanitize_pages(&sgx_dirty_page_list); > > - /* sanity check: */ > - WARN_ON(!list_empty(&sgx_dirty_page_list)); > + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list); > + if (left_dirty && !kthread_should_stop()) > + pr_err("%lu unsanitized pages\n", left_dirty); > > while (!kthread_should_stop()) { > if (try_to_freeze()) > > > > > > If you do the check here you already have a window > > where kthread could have been stopped anyhow. > > > > So even this would be less correct: > > > > if (kthreas_should_stop()) { > > return 0; > > } else if (left_dirty) { > > pr_err("%lu unsanitized pages\n", left_dirty); > > } > > > > So in the end you end as complicated and less correct > > fix. This all is explained in the commit message. > > > > If you unconditionally print error, you don't have > > a meaning for the number of unsanitized pags. > > Understood that the goal is to only print the > number of unsanitized pages if ksgxd has not been > stopped prematurely. Yeah, and thus give as useful information for sysadmin/developer as we can. BR, Jarkko