On Fri, Jan 10, 2025 at 7:31 AM Vlastimil Babka <vbabka@xxxxxxx> wrote: > > On 1/9/25 3:30 AM, Suren Baghdasaryan wrote: > > To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that > > object reuse before RCU grace period is over will be detected by > > lock_vma_under_rcu(). > > Current checks are sufficient as long as vma is detached before it is > > freed. The only place this is not currently happening is in exit_mmap(). > > Add the missing vma_mark_detached() in exit_mmap(). > > Another issue which might trick lock_vma_under_rcu() during vma reuse > > is vm_area_dup(), which copies the entire content of the vma into a new > > one, overriding new vma's vm_refcnt and temporarily making it appear as > > attached. This might trick a racing lock_vma_under_rcu() to operate on > > a reused vma if it found the vma before it got reused. To prevent this > > situation, we should ensure that vm_refcnt stays at detached state (0) > > when it is copied and advances to attached state only after it is added > > into the vma tree. Introduce vma_copy() which preserves new vma's > > vm_refcnt and use it in vm_area_dup(). Since all vmas are in detached > > state with no current readers when they are freed, lock_vma_under_rcu() > > will not be able to take vm_refcnt after vma got detached even if vma > > is reused. > > Finally, make vm_area_cachep SLAB_TYPESAFE_BY_RCU. This will facilitate > > vm_area_struct reuse and will minimize the number of call_rcu() calls. > > > > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx> > > Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx> > > You could also drop the reset_refcnt parameter of vma_lock_init() now, > as the usage in vm_area_dup() should now be just setting 0 over 0. Maybe > a VM_WARN_ON if it's not 0 already? Yeah, that's a good idea. Will do. > And a comment in vm_area_struct definition to consider vma_copy() when > adding any new field? Sure, will add. > > > + /* > > + * src->shared.rb may be modified concurrently, but the clone > > + * will be reinitialized. > > + */ > > + data_race(memcpy(&dest->shared, &src->shared, sizeof(dest->shared))); > > The comment makes it sound as if we didn't need to do it at all? But I > didn't verify. If we do need it in some cases (i.e. the just allocated > vma might have garbage from previous lifetime, but src is well defined > and it's a case where it's not reinitialized afterwards) maybe the > comment should say? Or if it's either reinitialized later or zeroes at > src, we could memset() the zeroes instead of memcpying them, etc. I see vm_area_dup() being used in dup_mmap() and I think this comment is about this usage in case the src vma changes from under us. However, vm_area_dup() is also used when we simply duplicate an existing vma while holding an mmap_write_lock, like in __split_vma(). In these cases there is no possibility of a race and copied value should hold. Maybe I should amend this comment like this: /* * src->shared.rb may be modified concurrently when called from dup_mmap(), * but the clone will reinitialize it. */ WDYT? > >