Re: Splitting the mmap_sem

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 13 Dec 2019 06:33:33 -0800

On Thu, Dec 12, 2019 at 06:46:13PM +0300, Kirill A. Shutemov wrote:
> On Thu, Dec 12, 2019 at 07:40:02AM -0800, Matthew Wilcox wrote:
> > On Thu, Dec 12, 2019 at 05:24:57PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Dec 03, 2019 at 02:21:47PM -0800, Matthew Wilcox wrote:
> > > > My preferred solution to the mmap_sem scalability problem is to allow
> > > > VMAs to be looked up under the RCU read lock then take a per-VMA lock.
> > > > I've been focusing on the first half of this problem (looking up VMAs
> > > > in an RCU-safe data structure) and ignoring the second half (taking a
> > > > lock while holding the RCU lock).
> > > 
> > > Do you expect this approach to be regression-free in the uncontended case?
> > > I doubt it can avoid regressions for single-threaded applications...
> > 
> > Which part of the approach do you think will cause a regression?  The
> > maple tree is quicker to traverse than the rbtree (in our simulations).
> > Incrementing a refcount on a VMA is surely no slower than acquiring an
> > uncontended rwsem for read.  mmap() and munmap() will get slower, but is
> > that a problem?
> 
> Yes, it is. Especially for short-lived processes. See kernel build as a
> workload.

Ah.  Well, we can skip the synchronize_rcu() step if the mm_struct has zero
or one mm_users.  That should avoid a slowdown for mmap/munmap.
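
Roughly, the check could look like the userspace sketch below.  This is
not kernel code: struct mm_struct is cut down to the one field that
matters here, synchronize_rcu_stub() only counts calls instead of waiting
for a grace period, and mm_init(), mm_needs_grace_period() and
vma_retire() are hypothetical names made up for illustration.  It also
assumes that an mm_users count of zero or one is enough to rule out
concurrent RCU readers of the VMA tree, which would need to be verified
before relying on it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's mm_struct (assumption: only the
 * user count is relevant to this decision). */
struct mm_struct {
	atomic_int mm_users;
};

/* Counts how many times we would have waited for a grace period. */
static int grace_periods_waited;

/* Stand-in for synchronize_rcu(); the real one blocks until all
 * pre-existing RCU readers have finished. */
static void synchronize_rcu_stub(void)
{
	grace_periods_waited++;
}

/* Hypothetical helper: set up an mm with a given number of users. */
static void mm_init(struct mm_struct *mm, int users)
{
	assert(mm != NULL);
	atomic_init(&mm->mm_users, users);
}

/* Hypothetical helper: with zero or one users, assume nobody else can
 * be walking the VMA tree, so no grace period is needed. */
static bool mm_needs_grace_period(struct mm_struct *mm)
{
	assert(mm != NULL);
	return atomic_load(&mm->mm_users) > 1;
}

/* Hypothetical munmap-side path: wait for readers only when the mm is
 * actually shared, then the VMA could be freed. */
static void vma_retire(struct mm_struct *mm)
{
	if (mm_needs_grace_period(mm))
		synchronize_rcu_stub();
	/* the VMA would be freed here */
}
```

With that, a single-threaded process (mm_users == 1) never pays for the
grace period on munmap(), while a multi-threaded one still does.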