On Tue, Sep 25, 2018 at 12:04:04PM +0300, Leon Romanovsky wrote:
> From: Parav Pandit <parav@xxxxxxxxxxxx>
>
> Currently mmap_sem is read locked while pinning the memory.
> In a multi-threaded application of a process, holding the mmap_sem lock
> creates contention with other threads that might be either registering
> memory, creating QPs or simply doing mmap(), as such operations also
> require holding the mmap_sem write lock.
>
> All such operations cannot make forward progress until one memory pin
> operation is completed.
> It becomes worse if the memory is unpinned and/or the memory
> registration is large (in GB range).
>
> Therefore, instead of holding mmap_sem for too long (for whole region
> pinning), acquire and release the lock for every few pages.
> For example on x86 with 4K page size, acquire and release mmap_sem for
> every 2Mbytes memory chunk.
>
> This allows other competing threads that might wish to hold mmap_sem
> for a shorter duration to make progress.
>
> When memory registration latency is measured using [1] for memory sizes
> ranging from 4K to 48GB, <= 1% or 0.5% degradation is noticed. In many
> runs no difference is seen other than run-to-run variance.
>
> In other targeted tests of users with large memory, desired improvements
> are seen due to reduced contention of mmap_sem.
>
> [1] https://github.com/paravmellanox/rtool
>
> ./rdma_resource_lat -c 1 -s 48G -a -u L -i 500 -A
> It registers pinned memory from 4K to 48GB size with 500 iterations for
> each memory size.
>
> ./rdma_resource_lat -c 1 -s 12G -a -u L -i 500 -t 4
> 4 competing threads pin memory, each of 12GB size with 500 iterations.
>
> Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/core/umem.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Applied to for-next, thanks

Jason