On Wed, Jun 16, 2021 at 11:41:19AM -0700, Andy Lutomirski wrote:
> mmgrab() and mmdrop() would be better if they were not full barriers.  As a
> trivial optimization, mmgrab() could use a relaxed atomic and mmdrop()
> could use a release on architectures that have these operations.

mmgrab() *is* relaxed, mmdrop() is a full barrier but could trivially be
made weaker once membarrier stops caring about it.

static inline void mmdrop(struct mm_struct *mm)
{
	unsigned int val = atomic_dec_return_release(&mm->mm_count);
	if (unlikely(!val)) {
		/* Provide REL+ACQ ordering for free() */
		smp_acquire__after_ctrl_dep();
		__mmdrop(mm);
	}
}

It's slightly less optimal for not being able to use the flags from the
decrement.

Or convert the whole thing to refcount_t (if appropriate) which already
does something like the above.
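
For anyone who wants to play with the ordering outside the kernel, here is
a rough userspace sketch of the same release-decrement plus acquire-on-zero
pattern using C11 atomics. It is only an analogy, not kernel code: struct
obj, obj_new(), obj_grab() and obj_drop() are made-up names standing in for
mm_struct, mmgrab() and mmdrop(), and atomic_thread_fence(memory_order_acquire)
is assumed to play roughly the role of smp_acquire__after_ctrl_dep().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mm_struct; not the kernel type. */
struct obj {
	atomic_uint count;	/* plays the role of mm_count */
	int payload;
};

/* Allocate an object holding one initial reference; NULL on failure. */
static struct obj *obj_new(int payload)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->count, 1);
	o->payload = payload;
	return o;
}

/*
 * Relaxed increment: the caller already holds a reference, so the
 * object cannot go away and no ordering is required here.
 */
static void obj_grab(struct obj *o)
{
	atomic_fetch_add_explicit(&o->count, 1, memory_order_relaxed);
}

/*
 * Release decrement. Returns true if this call dropped the last
 * reference and freed the object.
 */
static bool obj_drop(struct obj *o)
{
	/* fetch_sub returns the old value, so old == 1 means we hit zero. */
	unsigned int old = atomic_fetch_sub_explicit(&o->count, 1,
						     memory_order_release);

	assert(old != 0);	/* dropping a reference nobody held */
	if (old == 1) {
		/*
		 * Pair with the release decrements done by the other
		 * droppers, so their accesses to *o happen before free().
		 */
		atomic_thread_fence(memory_order_acquire);
		free(o);
		return true;
	}
	return false;
}
```

With one extra obj_grab() on a fresh object, the first obj_drop() returns
false and the second returns true and frees it. Only the final dropper pays
for the acquire; everyone else does a plain release decrement.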