Re: Splitting the mmap_sem


On Fri, Dec 13, 2019 at 06:33:33AM -0800, Matthew Wilcox wrote:
> On Thu, Dec 12, 2019 at 06:46:13PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Dec 12, 2019 at 07:40:02AM -0800, Matthew Wilcox wrote:
> > > On Thu, Dec 12, 2019 at 05:24:57PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Dec 03, 2019 at 02:21:47PM -0800, Matthew Wilcox wrote:
> > > > > My preferred solution to the mmap_sem scalability problem is to allow
> > > > > VMAs to be looked up under the RCU read lock then take a per-VMA lock.
> > > > > I've been focusing on the first half of this problem (looking up VMAs
> > > > > in an RCU-safe data structure) and ignoring the second half (taking a
> > > > > lock while holding the RCU lock).
> > > > 
> > > > Do you expect this approach to be regression-free for the uncontended case?
> > > > I doubt it will not cause regressions for single-threaded applications...
> > > 
> > > Which part of the approach do you think will cause a regression?  The
> > > maple tree is quicker to traverse than the rbtree (in our simulations).
> > > Incrementing a refcount on a VMA is surely no slower than acquiring an
> > > uncontended rwsem for read.  mmap() and munmap() will get slower, but is
> > > that a problem?
> > 
> > Yes, it is. Especially for short-lived processes. Consider a kernel build
> > as a workload.
> 
> Ah.  Well, we can skip the synchronize_rcu() step if the mm_struct has zero
> or one mm_users.  That should avoid a slowdown for mmap/munmap.

It may work. But I'm not sure how it will work with remote mm accesses,
such as the /proc/ interfaces or ptrace.
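
To make the concern concrete, here is a rough userspace sketch in C, not
kernel code. struct mm_sketch, can_skip_grace_period() and the
remote_access_*() helpers are hypothetical names made up for illustration;
only the mm_users counter and the "zero or one users" test come from the
proposal quoted above. It assumes a remote accessor pins the mm by raising
mm_users, as get_task_mm() does in the kernel. Whether every /proc and
ptrace path does that early enough is the open question, and the sketch
does not settle it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's mm_struct. */
struct mm_sketch {
	atomic_int mm_users;	/* number of users of this address space */
};

/*
 * The proposed fast path: with zero or one users, nobody else can be
 * walking the VMAs, so the grace-period wait could be skipped.
 * Note that this is only a snapshot of the counter.
 */
static bool can_skip_grace_period(struct mm_sketch *mm)
{
	assert(mm != NULL);
	return atomic_load(&mm->mm_users) <= 1;
}

/* Hypothetical remote accessor (think /proc or ptrace) pinning the mm. */
static void remote_access_begin(struct mm_sketch *mm)
{
	atomic_fetch_add(&mm->mm_users, 1);
}

/* Drop the reference taken by remote_access_begin(). */
static void remote_access_end(struct mm_sketch *mm)
{
	atomic_fetch_sub(&mm->mm_users, 1);
}
```

If a remote reader calls remote_access_begin() just after the owner has
evaluated can_skip_grace_period(), the owner has already decided to skip
the wait while the reader may still be looking at a VMA. That window is
what would need closing for the shortcut to be safe.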

-- 
 Kirill A. Shutemov



