On 03/05/2024 01:50, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
>
> Should do_swap_page() have the capability to directly map a large folio,
> metadata restoration becomes necessary for a specified number of pages
> denoted as nr. It's important to highlight that metadata restoration is
> solely required by the SPARC platform, which, however, does not enable
> THP_SWAP. Consequently, in the present kernel configuration, there
> exists no practical scenario where users necessitate the restoration of
> nr metadata. Platforms implementing THP_SWAP might invoke this function
> with nr values exceeding 1, subsequent to do_swap_page() successfully
> mapping an entire large folio. Nonetheless, their arch_do_swap_page_nr()
> functions remain empty.
>
> Cc: Khalid Aziz <khalid.aziz@xxxxxxxxxx>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: Andreas Larsson <andreas@xxxxxxxxxxx>
> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> ---
>  include/linux/pgtable.h | 26 ++++++++++++++++++++------
>  mm/memory.c             |  3 ++-
>  2 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 18019f037bae..463e84c3de26 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1084,6 +1084,15 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>  })
>
>  #ifndef __HAVE_ARCH_DO_SWAP_PAGE
> +static inline void arch_do_swap_page_nr(struct mm_struct *mm,
> +				     struct vm_area_struct *vma,
> +				     unsigned long addr,
> +				     pte_t pte, pte_t oldpte,
> +				     int nr)
> +{
> +
> +}
> +#else
>  /*
>   * Some architectures support metadata associated with a page. When a
>   * page is being swapped out, this metadata must be saved so it can be
> @@ -1092,12 +1101,17 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>   * page as metadata for the page. arch_do_swap_page() can restore this
>   * metadata when a page is swapped back in.
>   */
> -static inline void arch_do_swap_page(struct mm_struct *mm,
> -				     struct vm_area_struct *vma,
> -				     unsigned long addr,
> -				     pte_t pte, pte_t oldpte)

This hook seems to be very similar to arch_swap_restore(), I wonder if it makes
sense to merge them. Out of scope for this patch series though.

> -{
> -
> +static inline void arch_do_swap_page_nr(struct mm_struct *mm,
> +				     struct vm_area_struct *vma,
> +				     unsigned long addr,
> +				     pte_t pte, pte_t oldpte,
> +				     int nr)
> +{
> +	for (int i = 0; i < nr; i++) {
> +		arch_do_swap_page(vma->vm_mm, vma, addr + i * PAGE_SIZE,
> +				pte_advance_pfn(pte, i),
> +				pte_advance_pfn(oldpte, i));

It seems a bit odd to create a batched version of this, but not allow arches to
take advantage. Although I guess your point is that only SPARC implements it and
on that platform nr will always be 1? So no point right now? So this is just a
convenience for do_swap_page()? Makes sense.

Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx>

> +	}
>  }
>  #endif
>
> diff --git a/mm/memory.c b/mm/memory.c
> index f033eb3528ba..74cdefd58f5f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4266,7 +4266,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	VM_BUG_ON(!folio_test_anon(folio) ||
>  			(pte_write(pte) && !PageAnonExclusive(page)));
>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> -	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
> +	arch_do_swap_page_nr(vma->vm_mm, vma, vmf->address,
> +			pte, vmf->orig_pte, 1);
>
>  	folio_unlock(folio);
>  	if (folio != swapcache && swapcache) {
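
For anyone following along, the shape of the batched wrapper in the pgtable.h
hunk can be tried out in userspace. The sketch below is only an illustration
under stated assumptions: every mock_* name, the 4096-byte page size, and the
struct-based PTE are hypothetical stand-ins rather than kernel definitions, and
the mm/vma arguments are dropped for brevity. It shows the wrapper calling the
per-page hook once per page, advancing the address by one page and both PTE
values by one PFN each time.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only; NOT the kernel's definitions. */
#define MOCK_PAGE_SIZE 4096UL
#define MOCK_MAX_CALLS 16

/* Hypothetical simplified PTE: just a page frame number. */
typedef struct { unsigned long pfn; } mock_pte_t;

/* One recorded invocation of the per-page hook. */
struct mock_call {
	unsigned long addr;
	unsigned long pfn;
	unsigned long old_pfn;
};

static struct mock_call mock_calls[MOCK_MAX_CALLS];
static size_t mock_ncalls;

/* Stand-in for pte_advance_pfn(): returns a copy with the PFN moved by nr. */
static mock_pte_t mock_pte_advance_pfn(mock_pte_t pte, unsigned long nr)
{
	pte.pfn += nr;
	return pte;
}

/* Stand-in for an arch's per-page hook; it only records its arguments. */
static void mock_arch_do_swap_page(unsigned long addr, mock_pte_t pte,
				   mock_pte_t oldpte)
{
	assert(mock_ncalls < MOCK_MAX_CALLS);
	mock_calls[mock_ncalls].addr = addr;
	mock_calls[mock_ncalls].pfn = pte.pfn;
	mock_calls[mock_ncalls].old_pfn = oldpte.pfn;
	mock_ncalls++;
}

/* Mirrors the loop in the quoted hunk: one per-page call for each of nr pages. */
static void mock_arch_do_swap_page_nr(unsigned long addr, mock_pte_t pte,
				      mock_pte_t oldpte, int nr)
{
	for (int i = 0; i < nr; i++)
		mock_arch_do_swap_page(addr + i * MOCK_PAGE_SIZE,
				       mock_pte_advance_pfn(pte, i),
				       mock_pte_advance_pfn(oldpte, i));
}
```

Calling mock_arch_do_swap_page_nr() with nr = 1 degenerates to a single
per-page call, which matches how do_swap_page() uses the new helper in this
patch. An arch wanting a genuinely batched restore would need its own
implementation rather than this loop.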