On Wed, 2013-10-09 at 20:14 -0700, Linus Torvalds wrote:
> On Wed, Oct 9, 2013 at 12:28 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > The workload that I got the report from was a virus scanner, it would
> > spawn nr_cpus threads and {mmap file, scan content, munmap} through your
> > filesystem.
>
> So I suspect we could make the mmap_sem write area *much* smaller for
> the normal cases.
>
> Look at do_mmap_pgoff(), for example: it is run entirely under
> mmap_sem, but 99% of what it does doesn't actually need the lock.
>
> The part that really needs the lock is
>
>         addr = get_unmapped_area(file, addr, len, pgoff, flags);
>         addr = mmap_region(file, addr, len, vm_flags, pgoff);
>
> but we hold it over all the other stuff too.
>

True. By looking at the callers, we're always doing:

	down_write(&mm->mmap_sem);
	do_mmap_pgoff()
	...
	up_write(&mm->mmap_sem);

That goes for shm, aio, and of course mmap_pgoff().

While I know you hate two level locking, one way to go about this is to
take the lock inside do_mmap_pgoff() after the initial checks (flags,
page align, etc.) and return with the lock held, leaving the caller to
unlock it.

> In fact, even if we moved the mmap_sem down into do_mmap(), and moved
> code around a bit to only hold it over those functions, it would still
> cover unnecessarily much. For example, while merging is common, not
> merging is pretty common too, and we do that
>
>         vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
>
> allocation under the lock. We could easily do things like preallocate
> it outside the lock.
>

AFAICT there are also checks that should be done at the beginning of the
function, such as checking for MAP_LOCKED and VM_LOCKED flags before
calling get_unmapped_area().

Thanks,
Davidlohr
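
For illustration only, the locking shape discussed above (argument checks
first, allocation outside the lock, lock held only around address
selection and insertion) can be sketched in userspace C. This is not
kernel code and not a proposed patch: struct space, struct region,
space_init() and map_region() are hypothetical names, a pthread mutex
stands in for the write side of mmap_sem, and a trivial bump allocator
stands in for get_unmapped_area()/mmap_region().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stdlib.h>

#define PAGE_SZ 4096UL

/* Hypothetical stand-in for a VMA: one node in a per-space list. */
struct region {
	unsigned long start, len;
	struct region *next;
};

/* Hypothetical stand-in for mm_struct; 'lock' plays the role of mmap_sem. */
struct space {
	pthread_mutex_t lock;
	unsigned long next_free;	/* bump allocator, not a real address search */
	struct region *head;
	unsigned long nr_regions;
};

void space_init(struct space *sp)
{
	pthread_mutex_init(&sp->lock, NULL);
	sp->next_free = 0x10000UL;
	sp->head = NULL;
	sp->nr_regions = 0;
}

/* Returns 0 and stores the chosen address in *out, or a negative errno. */
int map_region(struct space *sp, unsigned long len, unsigned long *out)
{
	struct region *r;
	unsigned long start;

	assert(sp != NULL && out != NULL);

	/* 1. Argument checks and page alignment touch no shared state: no lock. */
	if (len == 0 || len > ULONG_MAX - (PAGE_SZ - 1))
		return -EINVAL;
	len = (len + PAGE_SZ - 1) & ~(PAGE_SZ - 1);

	/* 2. Preallocate outside the lock; the allocator may be slow. */
	r = calloc(1, sizeof(*r));
	if (!r)
		return -ENOMEM;

	/* 3. Only address selection and list insertion are serialized. */
	pthread_mutex_lock(&sp->lock);
	if (len > ULONG_MAX - sp->next_free) {
		pthread_mutex_unlock(&sp->lock);
		free(r);	/* preallocation went unused */
		return -ENOMEM;
	}
	start = sp->next_free;
	r->start = start;
	r->len = len;
	sp->next_free += len;
	r->next = sp->head;
	sp->head = r;
	sp->nr_regions++;
	pthread_mutex_unlock(&sp->lock);

	*out = start;
	return 0;
}
```

The cost of this shape is that a preallocated region is sometimes thrown
away on the failure path (or, in the real code, when the new mapping
merges with an existing one). The gain is that neither the checks nor the
allocation extend the time the lock is held.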