On Wed, Jan 20, 2021 at 05:19:44PM -0800, Dave Hansen wrote: > On 1/20/21 4:29 PM, Jarkko Sakkinen wrote: > > On Wed, Jan 20, 2021 at 09:35:28AM -0800, Sean Christopherson wrote: > >> On Fri, Jan 15, 2021, jarkko@xxxxxxxxxx wrote: > >>> From: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > >>> > >>> The most trivial example of a race condition can be demonstrated with this > >>> example where mm_list contains just one entry: > >>> > >>> CPU A CPU B > >>> sgx_release() > >>> sgx_mmu_notifier_release() > >>> list_del_rcu() > >>> sgx_encl_release() > >>> synchronize_srcu() > >>> cleanup_srcu_struct() > >>> > >>> To fix this, call synchronize_srcu() before checking whether mm_list is > >>> empty in sgx_release(). > >> Why haven't you included the splat that Haitao provided? That would go a long > >> way to helping answer Boris' question about exactly what is broken... > > I've lost the klog. > > Haitao said he thought it was this: > > > void cleanup_srcu_struct(struct srcu_struct *ssp) > > { > ... > > if (WARN_ON(srcu_readers_active(ssp))) > > return; /* Just leak it! */ > > Which would indicate that an 'encl' kref reached 0 while some other > thread was inside a > > idx = srcu_read_lock(&encl->srcu); > ... > srcu_read_unlock(&encl->srcu, idx); > > critical section. A quick audit didn't turn up any obvious suspects, > though. > > If that *is* it, it might be nice to try to catch the culprit at > srcu_read_{un}lock() time. If there's ever a 0 refcount at those sites, > it would be nice to spew a warning: > > idx = srcu_read_lock(&encl->srcu); > WARN_ON(!atomic_read(&encl->refcount->refcount); > ... > WARN_ON(!atomic_read(&encl->refcount->refcount); > srcu_read_unlock(&encl->srcu, idx); > > at each site. The root cause is fully known already and audited. An mm_list entry is kept up until the process exits *or* when VFS close happens, and sgx_release() executes and removes it. It's entirely possible that MMU notifier callback registered by a process happens while sgx_release() is executing, which causes list_del_rcu() to happen, unnoticed by sgx_release(). If that empties the list, cleanup_srcu_struct() is executed unsynchronized in the middle a unfinished grace period. /Jarkko