Re: [PATCH v1 10/10] mm: Allocate large folios for anonymous memory

On Tue, Jun 27, 2023 at 3:57 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

On 27/06/2023 04:01, Yu Zhao wrote:
On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

With all of the enabler patches in place, modify the anonymous memory
write allocation path so that it opportunistically attempts to allocate
a large folio up to `max_anon_folio_order()` size (this value is
ultimately configured by the architecture). This reduces the number of
page faults, reduces the size of (e.g. LRU) lists, and generally
improves performance by batching what were per-page operations into
per-(large)-folio operations.

If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then
`max_anon_folio_order()` always returns 0, meaning we get the existing
allocation behaviour.

Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
---
 mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 144 insertions(+), 15 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a8f7e2b28d7a..d23c44cc5092 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma)
                return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX;
 }

+/*
+ * Returns index of first pte that is not none, or nr if all are none.
+ */
+static inline int check_ptes_none(pte_t *pte, int nr)
+{
+       int i;
+
+       for (i = 0; i < nr; i++) {
+               if (!pte_none(ptep_get(pte++)))
+                       return i;
+       }
+
+       return nr;
+}
+
+static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order)

As suggested previously in 03/10, we can leave this for later.

I disagree. This is the logic that prevents us from accidentally replacing
already set PTEs, or wandering out of the VMA bounds etc. How would you catch
all those corner cases without this?

Again, sorry for not being clear previously: we definitely need to
handle alignments & overlaps. But the fallback, i.e., "for (; order >
1; order--) {" in calc_anon_folio_order_alloc() is not necessary.

For now, we just need something like

  bool is_order_suitable() {
    // check whether it fits properly
  }

Later on, we could add

  alloc_anon_folio_best_effort()
  {
    for a list of fallback orders
      is_order_suitable()
  }
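
To make the single-order check more concrete, here is a minimal userspace
sketch of what is_order_suitable() might verify: that the naturally aligned
range for the given order lies inside the VMA and that every PTE in it is
still none. This is only an illustration under stated assumptions, not the
proposed kernel code: struct model_vma, the flat ptes array (0 meaning
"none"), the MODEL_PAGE_* constants and the function signature are all
hypothetical stand-ins for vm_area_struct, pte_t and ptep_get()/pte_none().

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12UL
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the part of a VMA this check needs. */
struct model_vma {
	unsigned long start;	/* inclusive, page aligned */
	unsigned long end;	/* exclusive, page aligned */
};

/*
 * Toy page table: one entry per page of the VMA, 0 means "none".
 * Kernel code would inspect pte_t entries via ptep_get()/pte_none().
 */
static bool model_range_is_none(const unsigned long *ptes, size_t first,
				size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (ptes[first + i] != 0)
			return false;
	}
	return true;
}

/*
 * Returns true if a folio of the given order, naturally aligned around
 * fault_addr, fits inside the VMA and would not replace any populated
 * entry. Assumes fault_addr already lies within [vma->start, vma->end).
 * There is deliberately no fallback to smaller orders here.
 */
static bool is_order_suitable(const struct model_vma *vma,
			      const unsigned long *ptes,
			      unsigned long fault_addr, unsigned int order)
{
	unsigned long size = MODEL_PAGE_SIZE << order;
	unsigned long start = fault_addr & ~(size - 1);	/* align down */
	unsigned long end = start + size;

	/* Alignment must not push the range outside the VMA. */
	if (start < vma->start || end > vma->end)
		return false;

	/* Overlap check: every entry in the range must still be none. */
	return model_range_is_none(ptes,
				   (start - vma->start) >> MODEL_PAGE_SHIFT,
				   1UL << order);
}
```

A later alloc_anon_folio_best_effort() could then call this helper once per
candidate order, which would keep the fallback policy separate from the
correctness check.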



