On 11/04/2024 16:00, David Hildenbrand wrote: > On 11.04.24 16:54, Ryan Roberts wrote: >> On 09/04/2024 09:26, Barry Song wrote: >>> From: Barry Song <v-songbaohua@xxxxxxxx> >>> >>> Add a boolean argument named any_shared. If any of the swap entries are >>> non-exclusive, set any_shared to true. The function do_swap_page() can >>> then utilize this information to determine whether the entire large >>> folio can be reused. >>> >>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx> >>> --- >>> mm/internal.h | 9 ++++++++- >>> mm/madvise.c | 2 +- >>> mm/memory.c | 2 +- >>> 3 files changed, 10 insertions(+), 3 deletions(-) >>> >>> diff --git a/mm/internal.h b/mm/internal.h >>> index 9d3250b4a08a..cae39c372bfc 100644 >>> --- a/mm/internal.h >>> +++ b/mm/internal.h >>> @@ -238,7 +238,8 @@ static inline pte_t pte_next_swp_offset(pte_t pte) >>> * >>> * Return: the number of table entries in the batch. >>> */ >>> -static inline int swap_pte_batch(pte_t *start_ptep, int max_nr, pte_t pte) >>> +static inline int swap_pte_batch(pte_t *start_ptep, int max_nr, pte_t pte, >>> + bool *any_shared) >> >> Please update the docs in the comment above this for the new param; follow >> folio_pte_batch()'s docs as a template. >> >>> { >>> pte_t expected_pte = pte_next_swp_offset(pte); >>> const pte_t *end_ptep = start_ptep + max_nr; >>> @@ -248,12 +249,18 @@ static inline int swap_pte_batch(pte_t *start_ptep, int >>> max_nr, pte_t pte) >>> VM_WARN_ON(!is_swap_pte(pte)); >>> VM_WARN_ON(non_swap_entry(pte_to_swp_entry(pte))); >>> + if (any_shared) >>> + *any_shared |= !pte_swp_exclusive(pte); >> >> This is different from the approach in folio_pte_batch(). It inits *any_shared >> to false and does NOT include the value of the first pte. I think that's odd, >> personally and I prefer your approach. I'm not sure if there was a good reason >> that David chose the other approach? > > Because in my case calling code does > > nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags, > &any_writable); > > ... > > if (any_writable) > pte = pte_mkwrite(pte, src_vma); > > ... > > and later checks in another function pte_write(). > > So if the common pattern is that the original PTE will be used for checks, then > it doesn't make sense to unnecessary checks+setting for the first PTE. Yep understood. And I think adopting your semantics for any_shared actually simplifies the code in patch 4 too; I've just commented that.