* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: > This is a serious cause of mmap_sem contention. MAP_POPULATE > and MCL_FUTURE, in particular, are disastrous in multithreaded programs. > > Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx> > --- > > Changes from v1: > > The non-unlocking versions of do_mmap_pgoff and mmap_region are still > available for aio_setup_ring's benefit. In theory, aio_setup_ring > would do better with a lock-downgrading version, but that would be > somewhat ugly and doesn't help my workload. > > arch/tile/mm/elf.c | 9 +++--- > fs/aio.c | 4 +++ > include/linux/mm.h | 19 ++++++++++-- > ipc/shm.c | 6 ++-- > mm/fremap.c | 10 ++++-- > mm/mmap.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++------ > mm/util.c | 3 +- > 7 files changed, 117 insertions(+), 23 deletions(-) > +unsigned long mmap_region(struct file *file, unsigned long addr, > + unsigned long len, unsigned long flags, > + vm_flags_t vm_flags, unsigned long pgoff) > +{ > + return mmap_region_helper(file, addr, len, flags, vm_flags, pgoff, 0); > +} > + That 0 really wants to be NULL ... Also, with your patch applied there's no user of mmap_region() left anymore. More fundamentally, while I agree with the optimization, couldn't we de-uglify it a bit more? In particular, instead of this wrappery: > +unsigned long mmap_region_unlock(struct file *file, unsigned long addr, > + unsigned long len, unsigned long flags, > + vm_flags_t vm_flags, unsigned long pgoff) > +{ > + int downgraded = 0; > + unsigned long ret = mmap_region_helper(file, addr, len, > + flags, vm_flags, pgoff, &downgraded); > + > + if (downgraded) > + up_read(¤t->mm->mmap_sem); > + else > + up_write(¤t->mm->mmap_sem); > + > + return ret; > +} 1) We could at minimum wrap up the conditional unlocking as: up_read_write(&mm->mmap_sem, read_locked); With that I'd also suggest to rename 'downgraded' to 'read_locked', which more clearly expresses the locking state. 2) More aggressively, we could just make it the _rule_ that the mm lock gets downgraded to read in mmap_region_helper(), no matter what. >From a quick look I *think* all the usage sites (including sys_aio_setup()) are fine with that unlocking - but I could be wrong. There's a couple of shorter codepaths that would now see an extra op of downgrading: down_write(&mm->mmap_sem); ... downgrade_write(&mm->mmap_sem); ... up_read(&mm->mmap_sem); with not much work done with the lock read-locked - but I think they are all fine and mostly affect error paths. So there's no real value in keeping the conditional nature of the unlocking I think. That way all the usage sites could do a *much* cleaner pattern of: down_write(&mm->mmap_sem); ... do_mmap_pgoff_downgrade_write(...); ... up_read(&mm->mmap_sem); ... and that kind of cleanliness would instantly halve the size of your patch, it would improve all use sites, and would turn your patch into something I'd want to see applied straight away. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>