On Thu, May 21, 2020 at 11:58:02PM -0700, Sean Christopherson wrote:
> > +	kref_put(&encl_page->encl->refcount, sgx_encl_release);
> > +
> > +	spin_lock(&sgx_active_page_list_lock);
> > +	list_add_tail(&epc_page->list, &sgx_active_page_list);
> > +	spin_unlock(&sgx_active_page_list_lock);
>
> Ugh, this is wrong.  If the above kref_put() drops the last reference and
> releases the enclave, adding the page to the active page list will result
> in a use-after-free as the enclave will have been freed.  It also leaks the
> EPC page because sgx_encl_destroy() skips pages that are in the process of
> being reclaimed (as detected by list_empty()).
>
> The "original" code did the put() after list_add_tail(), but was moved in
> v15 to fix a bug where the put() could drop a reference to the wrong enclave
> if the page was freed and reallocated by a different CPU between
> list_add_tail() and put().  But, that particular bug only occurred because
> the code at the time was:
>
> 	sgx_encl_page_put(epc_page);
>
> I.e. the backpointer in epc_page was consumed after dropping the spin lock.
> So long as epc_page->owner (well, epc_page in general) isn't dereferenced,
> I'm 99% certain this can be fixed simply by doing kref_put() after moving
> the page back to the active page list.

Yes. It is certainly a regression to not call it after
sgx_active_page_list. That was a good catch, thanks.

v31:
* Unset SGX_ENCL_IOCTL in the error path of checking encl->flags in order to
  prevent leaving it set and thus block any further ioctl calls.
* Added missing cleanup_srcu_struct() call to sgx_encl_release().
* Take encl->lock in sgx_encl_add_page() in order to prevent races with the
  page reclaimer.
* Fix a use-after-free bug from page reclaimer. Call kref_put() for the
  encl->refcount only after putting enclave page back to the active page
  list because it could be the last ref to the enclave.

I'm ready to send a new version of the patch set once there is a conclusion
with the sigstruct vendor field.
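
For anyone following the ordering argument, here is a rough userspace sketch
in C of why the put has to come last. It is not the driver code: the toy_*
structs and functions are hypothetical stand-ins, the "active list" is
reduced to a flag, there is no locking, and the release callback is simply
assumed to free only pages it finds on the active list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an EPC page; not the kernel's struct. */
struct toy_page {
	bool on_active_list;	/* models !list_empty(&epc_page->list) */
	bool freed;
};

/* Hypothetical stand-in for an enclave that owns a single page. */
struct toy_encl {
	int refcount;		/* models the kref */
	bool released;
	struct toy_page *page;
};

/*
 * Models the release callback. Assumption: it frees only pages it finds
 * on the active list and skips pages the reclaimer currently holds.
 */
static void toy_encl_release(struct toy_encl *encl)
{
	if (encl->page->on_active_list) {
		encl->page->on_active_list = false;
		encl->page->freed = true;
	}
	encl->released = true;
}

/* Models kref_put(): returns true if this dropped the last reference. */
static bool toy_encl_put(struct toy_encl *encl)
{
	assert(encl->refcount > 0);
	if (--encl->refcount > 0)
		return false;
	toy_encl_release(encl);
	return true;
}

/* Fixed order: put the page back first, drop the reference last. */
static bool toy_put_back_fixed(struct toy_encl *encl, struct toy_page *page)
{
	page->on_active_list = true;	/* models list_add_tail() under the lock */
	return toy_encl_put(encl);	/* page is not touched after this */
}

/* Broken order: the final put runs release before the page is listed. */
static bool toy_put_back_broken(struct toy_encl *encl, struct toy_page *page)
{
	bool last = toy_encl_put(encl);

	page->on_active_list = true;	/* too late: release already skipped it */
	return last;
}
```

With a refcount of 1, toy_put_back_fixed() leaves the page freed and off the
list, while toy_put_back_broken() leaves an unfreed page on the list that
belongs to an already released enclave.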
/Jarkko