On Tue, Jun 27, 2023 at 3:57 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>
> On 27/06/2023 04:01, Yu Zhao wrote:
> > On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>
> >> With all of the enabler patches in place, modify the anonymous memory
> >> write allocation path so that it opportunistically attempts to allocate
> >> a large folio up to `max_anon_folio_order()` size (This value is
> >> ultimately configured by the architecture). This reduces the number of
> >> page faults, reduces the size of (e.g. LRU) lists, and generally
> >> improves performance by batching what were per-page operations into
> >> per-(large)-folio operations.
> >>
> >> If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then
> >> `max_anon_folio_order()` always returns 0, meaning we get the existing
> >> allocation behaviour.
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> >> ---
> >>  mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >>  1 file changed, 144 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index a8f7e2b28d7a..d23c44cc5092 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma)
> >>  	return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX;
> >>  }
> >>
> >> +/*
> >> + * Returns index of first pte that is not none, or nr if all are none.
> >> + */
> >> +static inline int check_ptes_none(pte_t *pte, int nr)
> >> +{
> >> +	int i;
> >> +
> >> +	for (i = 0; i < nr; i++) {
> >> +		if (!pte_none(ptep_get(pte++)))
> >> +			return i;
> >> +	}
> >> +
> >> +	return nr;
> >> +}
> >> +
> >> +static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order)
> >
> > As suggested previously in 03/10, we can leave this for later.
>
> I disagree. This is the logic that prevents us from accidentally replacing
> already set PTEs, or wandering out of the VMA bounds etc. How would you catch
> all those corner cases without this?

Again, sorry for not being clear previously: we definitely need to
handle alignments & overlaps. But the fallback, i.e., "for (; order >
1; order--) {" in calc_anon_folio_order_alloc() is not necessary.

For now, we just need something like

  bool is_order_suitable() {
    // check whether it fits properly
  }

Later on, we could add

  alloc_anon_folio_best_effort()
  {
    for a list of fallback orders
      is_order_suitable()
  }
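
To make the split above a bit more concrete, here is a minimal userspace
sketch of the two pieces. It is not kernel code and not what the patch
does: struct toy_vma, struct toy_fault and the pte_present[] array are
hypothetical stand-ins for vm_area_struct, vm_fault and the page table,
and the natural-alignment rule for the candidate range is an assumption
made for illustration. Only the names is_order_suitable() and
alloc_anon_folio_best_effort() come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-in for struct vm_area_struct: [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Hypothetical stand-in for struct vm_fault plus a toy page table. */
struct toy_fault {
	const struct toy_vma *vma;
	unsigned long address;   /* faulting address */
	const bool *pte_present; /* one flag per page of the VMA; true = PTE set */
};

/*
 * Check whether a folio of the given order fits around the faulting
 * address: the naturally aligned range (an assumption of this sketch)
 * must lie inside the VMA and must not overlap any already-set PTE.
 */
static bool is_order_suitable(const struct toy_fault *vmf, int order)
{
	unsigned long nr, size, start, end, first, i;

	assert(order >= 0);

	nr = 1UL << order;
	size = nr << PAGE_SHIFT;
	start = vmf->address & ~(size - 1);
	end = start + size;

	/* Alignment must not push the range outside the VMA. */
	if (start < vmf->vma->vm_start || end > vmf->vma->vm_end)
		return false;

	/* Overlap check: every PTE in the range must still be none. */
	first = (start - vmf->vma->vm_start) >> PAGE_SHIFT;
	for (i = 0; i < nr; i++) {
		if (vmf->pte_present[first + i])
			return false;
	}

	return true;
}

/*
 * Walk a caller-supplied list of fallback orders, largest first, and
 * return the first one that fits. Falls back to order 0 (a single page).
 */
static int alloc_anon_folio_best_effort(const struct toy_fault *vmf,
					const int *orders, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (is_order_suitable(vmf, orders[i]))
			return orders[i];
	}

	return 0;
}
```

With this shape, the "for now" version only calls is_order_suitable()
once for the preferred order and otherwise uses a single page; the
fallback list can be added later without touching the suitability check.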