On Mon 17-10-16 14:33:53, Laurent Dufour wrote:
> Hi all,
>
> I'm sorry to resurrect this topic, but with the increasing number of
> CPUs, this becomes more frequent that the mmap_sem is a bottleneck
> especially between the page fault handling and the other threads memory
> management calls.
>
> In the case I'm seeing, there is a lot of page fault occurring while
> other threads are trying to manipulate the process memory layout through
> mmap/munmap.
>
> There is no *real* conflict between these operations, the page fault are
> done a different page and areas that the one addressed by the mmap/unmap
> operations. Thus threads are dealing with different part of the
> process's memory space. However since page fault handlers and mmap/unmap
> operations grab the mmap_sem, the page fault handling are serialized
> with the mmap operations, which impact the performance on large system.

Could you quantify how much overhead we are talking about here?

> For the record, the page fault are done while reading data from a file
> system, and I/O are really impacted by this serialization when dealing
> with a large number of parallel threads, in my case 192 threads (1 per
> online CPU). But the source of the page fault doesn't really matter I guess.

But we are dropping the mmap_sem for the IO and retrying the page fault.
I am not sure I understood you correctly here, though.

> I took time trying to figure out how to get rid of this bottleneck, but
> this is definitively too complex for me.
> I read this mailing history, and some LWN articles about that and my
> feeling is that there is no clear way to limit the impact of this
> semaphore. Last discussion on this topic seemed to happen last march
> during the LSFMM submit (https://lwn.net/Articles/636334/). But this
> doesn't seem to have lead to major changes, or may be I missed them.

At least mmap/munmap write lock contention could be reduced by the above
proposed range locking.
Jan Kara has implemented a prototype [1] of the lock for mapping, which
could be used for mmap_sem as well, but it had some performance
implications AFAIR. There hasn't been a strong use case for this so far.
If there is one, please describe it and we can think about what to do
about it.

There were also some attempts to replace mmap_sem by RCU AFAIR but my
vague recollection is that they had some issues as well.

[1] http://linux-kernel.2935.n7.nabble.com/PATCH-0-6-RFC-Mapping-range-lock-td592872.html
--
Michal Hocko
SUSE Labs