On Tue, Dec 17, 2024 at 2:26 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Dec 16, 2024 at 11:24:16AM -0800, Suren Baghdasaryan wrote:
> > vma_start_read() can temporarily raise vm_refcnt of a write-locked and
> > detached vma:
> >
> > // vm_refcnt==1 (attached)
> > vma_start_write()
> >     vma->vm_lock_seq = mm->mm_lock_seq
> >
> >                     vma_start_read()
> >                         vm_refcnt++; // vm_refcnt==2
> >
> > vma_mark_detached()
> >     vm_refcnt--; // vm_refcnt==1
> >
> > // vma is detached but vm_refcnt!=0 temporarily
> >
> >                     if (vma->vm_lock_seq == mm->mm_lock_seq)
> >                         vma_refcount_put()
> >                             vm_refcnt--; // vm_refcnt==0
> >
> > This is currently not a problem when freeing the vma because RCU grace
> > period should pass before kmem_cache_free(vma) gets called and by that
> > time vma_start_read() should be done and vm_refcnt is 0. However once
> > we introduce possibility of vma reuse before RCU grace period is over,
> > this will become a problem (reused vma might be in non-detached state).
> > Introduce vma_ensure_detached() for the writer to wait for readers until
> > they exit vma_start_read().
>
> So aside from the lockdep problem (which I think is fixable), the normal
> way to fix the above is to make dec_and_test() do the kmem_cache_free().
>
> Then the last user does the free and everything just works.

I see your point. Let me reply in the other patch where you have more
comments about this.
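
For anyone following along, the "last user does the free" pattern
suggested above can be sketched in userspace C roughly as follows.
This is only an illustration under assumptions, not code from this
series: struct obj, obj_alloc(), obj_get(), obj_put() and
replay_interleaving() are hypothetical names, C11 atomics stand in for
the kernel's refcount helpers, and free() stands in for
kmem_cache_free().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma; only the refcount matters here. */
struct obj {
	atomic_int refcnt;	/* 1 while attached, +1 per reader in flight */
};

static int nr_freed;		/* demo bookkeeping: number of frees so far */

static struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		atomic_init(&o->refcnt, 1);	/* the "attached" reference */
	return o;
}

/* Reader side: take a temporary reference. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/*
 * Drop one reference. Whoever drops the last one frees the object.
 * Returns true if this call did the free.
 */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* catch refcount underflow in the demo */
	if (old == 1) {
		free(o);
		nr_freed++;
		return true;
	}
	return false;
}

/*
 * Replay the interleaving from the quoted changelog on one thread.
 * Returns true if the writer's put did not free and the reader's did.
 */
static bool replay_interleaving(void)
{
	struct obj *o = obj_alloc();
	bool freed_by_writer, freed_by_reader;

	if (!o)
		return false;

	obj_get(o);			/* reader enters:   1 -> 2 */
	freed_by_writer = obj_put(o);	/* writer detaches: 2 -> 1 */
	freed_by_reader = obj_put(o);	/* reader exits:    1 -> 0, frees */

	return !freed_by_writer && freed_by_reader;
}
```

With this shape the writer never has to wait for the reader: the
detach simply drops its reference, and the free happens in whichever
obj_put() takes the count to zero. The sketch leaves out what a real
reader would also need, namely refusing to take a reference once the
count has already reached zero.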