On Thu, Mar 22, 2018 at 09:06:14AM -0700, Yang Shi wrote:
> On 3/22/18 2:10 AM, Michal Hocko wrote:
> > On Wed 21-03-18 15:36:12, Yang Shi wrote:
> > > On 3/21/18 2:23 PM, Michal Hocko wrote:
> > > > On Wed 21-03-18 10:16:41, Yang Shi wrote:
> > > > > proc_pid_cmdline_read(), it calls access_remote_vm() which need acquire
> > > > > mmap_sem too, so the mmap_sem scalability issue will be hit sooner or later.
> > > > Ohh, absolutely. mmap_sem is unfortunatelly abused and it would be great
> > > > to remove that. munmap should perform much better. How to do that safely
> > The full vma will have to be range locked. So there is nothing small or large.
>
> It sounds not helpful to a single large vma case since just one range lock
> for the vma, it sounds equal to mmap_sem.

But splitting mmap_sem into pieces is beneficial for this case. Imagine
we have a spinlock / rwlock to protect the rbtree / arg_start / arg_end
/ ... and then each VMA has a rwsem (or equivalent). access_remote_vm()
would walk the tree and grab the VMA's rwsem for read while reading
out the arguments. The munmap code would have a completely different
VMA write-locked.