On Thu, Jan 09, 2020 at 12:13:20PM -0800, Matthew Wilcox wrote:
> One of the use cases that we're concerned about involves a high
> percentage of page faults on a single large (terabytes) VMA (and a
> highly multithreaded process). Moving the contention from a rwsem
> in the mm_struct to a refcount in the VMA will not help performance
> substantially for this user.

This is why I never believed in the VMA-refcount approach, and why my
patches used SRCU.

> The proposal consists of three phases. In phase 1, we convert the
> rbtree to the maple tree, and leave the locking alone. In phase 2,
> we change the locking to a per-VMA refcount, looked up under RCU.
>
> This problem arises during phase 3 where we attempt to handle page
> faults entirely under the RCU read lock. If we encounter problems,
> we can fall back to acquiring the VMA refcount, but we need the
> page allocation to fail rather than sleep (or magically drop the
> RCU lock and return an indication that it has done so, but that
> doesn't seem to be an approach that would find any favour).

So why not use SRCU? You can do full blocking faults under SRCU and
don't need no 'stinkin' refcounts ;-)