On Sun, Feb 07, 2021 at 11:29:49PM +0200, Jarkko Sakkinen wrote:
> On Fri, Feb 05, 2021 at 11:36:57AM -0800, Dave Hansen wrote:
> > On 2/5/21 10:28 AM, Jarkko Sakkinen wrote:
> > > This has been shown in tests:
> > >
> > > [ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
> > >
> > > There are two functions that drain encl->mm_list:
> > >
> > > - sgx_release() (i.e. VFS release) removes the remaining mm_list entries.
> > > - sgx_mmu_notifier_release() removes mm_list entry for the registered
> > >   process, if it still exists.
> >
> > Jarkko, I like your approach. This actually has the potential to be a
> > lot more understandable than the fix we settled on before.
>
> Yeah, it's more like by-the-book use of refcount, each process gets
> a reference. This way things should be always serialized correctly.
>
> > But I think the explanation needs some tweaking, and I think I can take
> > it a step further to make it even more straightforward. The issue here
> > isn't *really* mm_list, it's this:
> >
> >	encl_mm->encl = encl;
>
> Agreed.
>
> This was also in center of thinking when I did this new patch.
>
> > That literally establishes an encl_mm to encl reference and needs a
> > reference count. That reference remains until 'encl_mm' is freed. I
> > don't think mm_list needs to even be taken into account.
> >
> > The most straightforward way to fix this is to take a refcount at
> > "encl_mm->encl = encl" and release it at kfree(encl_mm). That makes a
> > *lot* of logical sense to me, and it's also trivial to audit.
> >
> > Totally untested patch attached (adapted directly from yours).
>
> I tested this version, and it also seems to work. Boris, can you
> pick this refined version from Dave's attachment or do you prefer
> that I do a re-send?

Never mind. I'll send a proper patch (just noticed that the attachment
did have a short summary).

/Jarkko
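
The pairing Dave describes (take a reference at "encl_mm->encl = encl",
drop it where encl_mm is freed) can be sketched in user-space C. This is
only an illustration under stated assumptions, not the patch from the
thread: struct encl, struct encl_mm, encl_get(), encl_put(),
encl_mm_alloc() and encl_mm_free() are hypothetical stand-ins, a plain
int replaces the kernel's reference counter, and there is no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the enclave object; not the kernel struct. */
struct encl {
	int refcount;	/* plain int here; kernel code would use a real refcount type */
	int released;	/* set once the last reference is dropped (illustration only) */
};

/* Hypothetical stand-in for the per-process object that points at the enclave. */
struct encl_mm {
	struct encl *encl;
};

void encl_get(struct encl *encl)
{
	encl->refcount++;
}

void encl_put(struct encl *encl)
{
	assert(encl->refcount > 0);
	if (--encl->refcount == 0)
		encl->released = 1;	/* real code would tear the object down here */
}

/* Take the reference at the exact point the pointer is stored. */
struct encl_mm *encl_mm_alloc(struct encl *encl)
{
	struct encl_mm *encl_mm = calloc(1, sizeof(*encl_mm));

	if (!encl_mm)
		return NULL;

	encl_get(encl);
	encl_mm->encl = encl;
	return encl_mm;
}

/* Drop the reference at the exact point the holder of the pointer is freed. */
void encl_mm_free(struct encl_mm *encl_mm)
{
	struct encl *encl = encl_mm->encl;

	free(encl_mm);
	encl_put(encl);
}
```

With this shape, auditing reduces to checking two places: every store to
encl_mm->encl is preceded by a get, and every free of an encl_mm is
followed by a put. The enclave cannot be released while any encl_mm
still points at it, regardless of which path frees the encl_mm.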