Re: [RFC PATCH 18/18] Documentation: add document for pte_ref

On 2022/4/30 9:19 PM, Bagas Sanjaya wrote:
Hi Qi,

On Fri, Apr 29, 2022 at 09:35:52PM +0800, Qi Zheng wrote:
+Now in order to pursue high performance, applications mostly use some
+high-performance user-mode memory allocators, such as jemalloc or tcmalloc.
+These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release
+physical memory for the following reasons::
+
+ First of all, we should hold as few write locks of mmap_lock as possible,
+ since the mmap_lock semaphore has long been a contention point in the
+ memory management subsystem. The mmap()/munmap() hold the write lock, and
+ the madvise(MADV_DONTNEED or MADV_FREE) hold the read lock, so using
+ madvise() instead of munmap() to released physical memory can reduce the
+ competition of the mmap_lock.
+
+ Secondly, after using madvise() to release physical memory, there is no
+ need to build vma and allocate page tables again when accessing the same
+ virtual address again, which can also save some time.
+

I think we can use enumerated list, like below:

Thanks for your review, LGTM, will do.
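
For readers following the thread, the two reasons quoted above can be seen
from userspace with a minimal sketch. It assumes Linux semantics for a private
anonymous mapping, where pages dropped with MADV_DONTNEED read back as zeroes
on the next access. The helper name release_and_reuse() is hypothetical and
not part of the patch; it only illustrates that the mapping stays usable after
madvise(), so no second mmap() is needed.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the patch): dirty an anonymous mapping,
 * release its physical pages with MADV_DONTNEED, then touch it again.
 * Returns 0 if the region is still mapped and reads back zero-filled,
 * 1 if it reads back non-zero, -1 if a system call fails.
 */
static int release_and_reuse(size_t len)
{
	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int zeroed;

	if (buf == MAP_FAILED)
		return -1;

	/* Fault the pages in so there is physical memory to release. */
	memset(buf, 0xAB, len);

	/* Drop the physical pages; the virtual mapping (vma) is kept. */
	if (madvise(buf, len, MADV_DONTNEED) != 0) {
		munmap(buf, len);
		return -1;
	}

	/* No new mmap() is needed: the same address is still valid. */
	zeroed = (buf[0] == 0 && buf[len - 1] == 0);
	buf[0] = 1;

	munmap(buf, len);
	return zeroed ? 0 : 1;
}
```

Calling release_and_reuse(4 * 4096) should return 0 on Linux. A munmap()-based
allocator would instead have to call mmap() again before reusing the range.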


-- >8 --

diff --git a/Documentation/vm/pte_ref.rst b/Documentation/vm/pte_ref.rst
index 0ac1e5a408d7c6..67b18e74fcb367 100644
--- a/Documentation/vm/pte_ref.rst
+++ b/Documentation/vm/pte_ref.rst
@@ -10,18 +10,18 @@ Preface
  Now in order to pursue high performance, applications mostly use some
  high-performance user-mode memory allocators, such as jemalloc or tcmalloc.
  These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release
-physical memory for the following reasons::
-
- First of all, we should hold as few write locks of mmap_lock as possible,
- since the mmap_lock semaphore has long been a contention point in the
- memory management subsystem. The mmap()/munmap() hold the write lock, and
- the madvise(MADV_DONTNEED or MADV_FREE) hold the read lock, so using
- madvise() instead of munmap() to released physical memory can reduce the
- competition of the mmap_lock.
-
- Secondly, after using madvise() to release physical memory, there is no
- need to build vma and allocate page tables again when accessing the same
- virtual address again, which can also save some time.
+physical memory for the following reasons:
+
+1. We should hold the write lock of mmap_lock as rarely as possible,
+   since the mmap_lock semaphore has long been a contention point in the
+   memory management subsystem. mmap() and munmap() take the write lock, while
+   madvise(MADV_DONTNEED or MADV_FREE) takes only the read lock, so using
+   madvise() instead of munmap() to release physical memory reduces
+   contention on mmap_lock.
+
+2. After using madvise() to release physical memory, there is no
+   need to rebuild the vma and reallocate page tables when the same
+   virtual address is accessed again, which also saves some time.

  The following is the largest user PTE page table memory that can be
  allocated by a single user process in a 32-bit and a 64-bit system.

+The following is the largest user PTE page table memory that can be
+allocated by a single user process in a 32-bit and a 64-bit system.
+

We can say "assuming 4K page size" here,

++---------------------------+--------+---------+
+|                           | 32-bit | 64-bit  |
++===========================+========+=========+
+| user PTE page table pages | 3 MiB  | 512 GiB |
++---------------------------+--------+---------+
+| user PMD page table pages | 3 KiB  | 1 GiB   |
++---------------------------+--------+---------+
+
+(for 32-bit, take 3G user address space, 4K page size as an example;
+ for 64-bit, take 48-bit address width, 4K page size as an example.)
+

... instead of here.

will do.
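
As a cross-check of the table quoted above, the figures can be reproduced with
a little arithmetic. This is only a sketch under stated assumptions: 4K pages,
4-byte page table entries on 32-bit and 8-byte entries on 64-bit (the entry
sizes are not given in the patch), and the address space sizes from the
parenthetical note. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Bytes of PTE page table pages needed to map va_bytes of address space:
 * one entry of entry_size bytes per page of page_size bytes.
 */
static uint64_t pte_table_bytes(uint64_t va_bytes, uint64_t page_size,
				uint64_t entry_size)
{
	return va_bytes / page_size * entry_size;
}

/*
 * Bytes of PMD page table pages: one entry per PTE page table page.
 */
static uint64_t pmd_table_bytes(uint64_t va_bytes, uint64_t page_size,
				uint64_t entry_size)
{
	uint64_t pte_pages =
		pte_table_bytes(va_bytes, page_size, entry_size) / page_size;

	return pte_pages * entry_size;
}
```

With 3G of user address space and 4-byte entries this gives 3 MiB and 3 KiB;
with a 48-bit address width and 8-byte entries it gives 512 GiB and 1 GiB,
matching the table.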


+There is also a lock-less scenario(such as fast GUP). Fortunately, we don't need
+to do any additional operations to ensure that the system is in order. Take fast
+GUP as an example::
+
+	thread A		thread B
+	fast GUP		madvise(MADV_DONTNEED)
+	========		======================
+
+	get_user_pages_fast_only()
+	--> local_irq_save();
+				call_rcu(pte_free_rcu)
+	    gup_pgd_range();
+	    local_irq_restore();
+	    			/* do pte_free_rcu() */
+

I see whitespace warning circa do pte_free_rcu() line above when
applying this series.

will fix.

Thanks,
Qi


Thanks.


--
Thanks,
Qi

