Re: Splitting the mmap_sem

"Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> · Thu, 12 Dec 2019 18:46:13 +0300

On Thu, Dec 12, 2019 at 07:40:02AM -0800, Matthew Wilcox wrote:
> On Thu, Dec 12, 2019 at 05:24:57PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Dec 03, 2019 at 02:21:47PM -0800, Matthew Wilcox wrote:
> > > My preferred solution to the mmap_sem scalability problem is to allow
> > > VMAs to be looked up under the RCU read lock then take a per-VMA lock.
> > > I've been focusing on the first half of this problem (looking up VMAs
> > > in an RCU-safe data structure) and ignoring the second half (taking a
> > > lock while holding the RCU lock).
> > 
> > Do you expect this approach to be regression-free in the uncontended case?
> > I suspect it will cause regressions for single-threaded applications...
> 
> Which part of the approach do you think will cause a regression?  The
> maple tree is quicker to traverse than the rbtree (in our simulations).
> Incrementing a refcount on a VMA is surely no slower than acquiring an
> uncontended rwsem for read.  mmap() and munmap() will get slower, but is
> that a problem?

Yes, it is, especially for short-lived processes. Consider a kernel build
as a workload.

-- 
 Kirill A. Shutemov