[to-be-updated] hugetlbfs-take-read_lock-on-i_mmap-for-pmd-sharing.patch removed from -mm tree

The patch titled
     Subject: hugetlbfs: take read_lock on i_mmap for PMD sharing
has been removed from the -mm tree.  Its filename was
     hugetlbfs-take-read_lock-on-i_mmap-for-pmd-sharing.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Waiman Long <longman@xxxxxxxxxx>
Subject: hugetlbfs: take read_lock on i_mmap for PMD sharing

A customer with large SMP systems (up to 16 sockets) running an
application that uses a large amount of static hugepages (~500-1500GB) is
experiencing random multisecond delays.  These delays were caused by the
long time it took to scan the VMA interval tree with mmap_sem held.

Sharing a huge PMD does not require any changes to the i_mmap tree.
Therefore, we can just take the read lock and let other threads searching
for the right VMA share it in parallel.  Once the right VMA is found,
either the PMD lock (2M huge page for x86-64) or the mm->page_table_lock
will be acquired to perform the actual PMD sharing.

Lock contention, if present, will happen in the spinlock.  That is much
better than contention in the rwsem, where the time needed to scan the
interval tree is indeterminate.

With this patch applied, the customer is seeing a significant performance
improvement over the unpatched kernel.

Note that huge_pmd_share() now increments the page count with the
semaphore held only in read mode.  It is OK to do the increments in
parallel without further synchronization.  However, we don't want anyone
else changing the count while the page count check in huge_pmd_unshare()
is happening.  Hence the need for taking the semaphore in write mode
there.

[mike.kravetz@xxxxxxxxxx: changelog additions]
Link: http://lkml.kernel.org/r/20191107211809.9539-1-longman@xxxxxxxxxx
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Suggested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c~hugetlbfs-take-read_lock-on-i_mmap-for-pmd-sharing
+++ a/mm/hugetlb.c
@@ -4769,7 +4769,7 @@ pte_t *huge_pmd_share(struct mm_struct *
 	if (!vma_shareable(vma, addr))
 		return (pte_t *)pmd_alloc(mm, pud, addr);
 
-	i_mmap_lock_write(mapping);
+	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
 			continue;
@@ -4799,7 +4799,7 @@ pte_t *huge_pmd_share(struct mm_struct *
 	spin_unlock(ptl);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
-	i_mmap_unlock_write(mapping);
+	i_mmap_unlock_read(mapping);
 	return pte;
 }
 
_

Patches currently in -mm which might be from longman@xxxxxxxxxx are
