On Thu, Dec 12, 2019 at 05:24:57PM +0300, Kirill A. Shutemov wrote:
> On Tue, Dec 03, 2019 at 02:21:47PM -0800, Matthew Wilcox wrote:
> > My preferred solution to the mmap_sem scalability problem is to allow
> > VMAs to be looked up under the RCU read lock then take a per-VMA lock.
> > I've been focusing on the first half of this problem (looking up VMAs
> > in an RCU-safe data structure) and ignoring the second half (taking a
> > lock while holding the RCU lock).
>
> Do you see this approach to be regression-free for uncontended case?
> I doubt it will not cause regressions for single-threaded applications...

Which part of the approach do you think will cause a regression?  The
maple tree is quicker to traverse than the rbtree (in our simulations).
Incrementing a refcount on a VMA is surely no slower than acquiring an
uncontended rwsem for read.  mmap() and munmap() will get slower, but
is that a problem?

> > We currently only have one ->map_pages() callback, and it's
> > filemap_map_pages().  It only needs to sleep in one place -- to allocate
> > a PTE table.  I think that can be allocated ahead of time if needed.
>
> No, filemap_map_pages() doesn't sleep. It cannot. Whole body of the
> function is under rcu_read_lock(). It uses pre-allocated page table.
> See do_fault_around().

Oh, thank you!  That makes the ->map_pages() optimisation already
workable with no changes.
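
To make the "look up under RCU, then take a refcount" idea concrete, here
is a rough userspace sketch in C11.  It is not kernel code and says nothing
about performance.  struct toy_vma and the toy_* helpers are names invented
for illustration, a linear scan stands in for the maple tree walk, and the
RCU read-side critical section is only marked by comments.  The one point
it tries to show is the "take a reference unless the count is already zero"
step that pins the VMA once it has been found.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct toy_vma {
	unsigned long start;	/* first address covered (inclusive) */
	unsigned long end;	/* first address past the range (exclusive) */
	atomic_int refcount;	/* 0 means the VMA is already being torn down */
};

/* Take a reference unless the count has already dropped to zero. */
static bool toy_vma_tryget(struct toy_vma *vma)
{
	int old = atomic_load_explicit(&vma->refcount, memory_order_relaxed);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&vma->refcount,
				&old, old + 1,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool toy_vma_put(struct toy_vma *vma)
{
	int old = atomic_fetch_sub_explicit(&vma->refcount, 1,
					    memory_order_release);

	assert(old > 0);
	return old == 1;
}

/*
 * Find the VMA covering 'addr' and pin it.  In the kernel the search
 * would sit between rcu_read_lock() and rcu_read_unlock(); here the
 * array is assumed to stay valid for the duration of the call.
 */
static struct toy_vma *toy_find_vma_get(struct toy_vma *vmas, size_t n,
					unsigned long addr)
{
	/* rcu_read_lock() would go here */
	for (size_t i = 0; i < n; i++) {
		struct toy_vma *vma = &vmas[i];

		if (addr >= vma->start && addr < vma->end)
			return toy_vma_tryget(vma) ? vma : NULL;
	}
	/* rcu_read_unlock() would go here (and before the return above) */
	return NULL;
}
```

A caller that gets NULL back (no VMA, or one that is going away) would
presumably fall back to the existing locked path; the sketch leaves that
part out.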