On Fri, 2 Aug 2024 17:55:13 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote:

> Looking into a way of moving the last folio_likely_mapped_shared() call
> in add_folio_for_migration() under the PTL, I found myself removing
> follow_page(). This paves the way for cleaning up all the FOLL_/follow_*
> terminology, which should really just be called "GUP" nowadays.
>
> The new page table walker looks up a mapped folio and returns to the
> caller with the PTL held, such that the folio cannot get unmapped
> concurrently. Callers can then conditionally decide whether they really
> want to take a short-term folio reference or whether they can simply
> unlock the PTL and be done with it.
>
> folio_walk is similar to page_vma_mapped_walk(), except that we don't
> know the folio we want to walk to, and that we are only walking to
> exactly one PTE/PMD/PUD.
>
> folio_walk provides access to the pte/pmd/pud (and to the referenced
> folio page, because things like KSM need that); however, as part of
> this series, no page table modifications are performed by users.
>
> We might be able to convert some other walk_page_range() users that
> really only walk to one address, such as DAMON with
> damon_mkold_ops/damon_young_ops. It might make sense to extend
> folio_walk in the future to optionally fault in a folio (if
> applicable), such that we can replace some get_user_pages() users that
> really only want to look up a single page/folio under the PTL without
> unconditionally grabbing a folio reference.
>
> I have plans to extend the approach to a range walker that will try
> batching various page table entries (not just folio pages) to be a
> better replacement for walk_page_range() -- users will be able to opt
> in to which type of page table entries they want to process -- but
> that will require more work and more thought.
>
> KSM seems to work just fine (ksm_functional_tests selftests) and
> move_pages seems to work (migration selftest). I tested the leaf
> implementation extensively using various hugetlb sizes (64K, 2M, 32M,
> 1G) on arm64 using move_pages, and did some more testing on x86-64.
> Cross-compiled on a bunch of architectures.
>
> I am not able to test the s390x Secure Execution changes, unfortunately.
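For other reviewers following along: the usage pattern that falls out of
this is quite compact. Below is a minimal sketch of a caller, assuming
the folio_walk_start()/folio_walk_end() names from patch #2; the flags
argument and the fw.level/fw.pte/fw.page field names are my reading of
the cover letter above, not verified against the patches themselves.

    /*
     * Minimal sketch of the pattern described in the cover letter.
     * The struct folio_walk fields and the flags argument are guesses
     * based on the description, not the final interface.
     */
    static int inspect_one_folio(struct vm_area_struct *vma,
                                 unsigned long addr)
    {
            struct folio_walk fw;
            struct folio *folio;

            /* Requires the mmap lock; returns with the PTL held. */
            folio = folio_walk_start(&fw, vma, addr, 0);
            if (!folio)
                    return -ENOENT;

            /*
             * fw.level would indicate whether we stopped at a PTE, PMD
             * or PUD entry; fw.pte/fw.pmd/fw.pud and fw.page are valid
             * accordingly.
             */
            if (folio_likely_mapped_shared(folio)) {
                    /* Nothing to do: drop the PTL, no reference taken. */
                    folio_walk_end(&fw, vma);
                    return -EBUSY;
            }

            /* Take a short-term reference only if we really need one. */
            folio_get(folio);
            folio_walk_end(&fw, vma);

            /* ... work on the folio without holding the PTL ... */

            folio_put(folio);
            return 0;
    }

Compared to follow_page(), the unconditional folio reference (and the
FOLL_ flags) simply go away for callers like this.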
The series looks good; we will do some tests and report back whether
everything is OK.

> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>
> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>
> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
> Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
> Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> Cc: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> Cc: Sven Schnelle <svens@xxxxxxxxxxxxx>
> Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
>
> David Hildenbrand (11):
>   mm: provide vm_normal_(page|folio)_pmd() with
>     CONFIG_PGTABLE_HAS_HUGE_LEAVES
>   mm/pagewalk: introduce folio_walk_start() + folio_walk_end()
>   mm/migrate: convert do_pages_stat_array() from follow_page() to
>     folio_walk
>   mm/migrate: convert add_page_for_migration() from follow_page() to
>     folio_walk
>   mm/ksm: convert get_mergeable_page() from follow_page() to folio_walk
>   mm/ksm: convert scan_get_next_rmap_item() from follow_page() to
>     folio_walk
>   mm/huge_memory: convert split_huge_pages_pid() from follow_page() to
>     folio_walk
>   s390/uv: convert gmap_destroy_page() from follow_page() to folio_walk
>   s390/mm/fault: convert do_secure_storage_access() from follow_page()
>     to folio_walk
>   mm: remove follow_page()
>   mm/ksm: convert break_ksm() from walk_page_range_vma() to folio_walk
>
>  Documentation/mm/transhuge.rst |   6 +-
>  arch/s390/kernel/uv.c          |  18 ++-
>  arch/s390/mm/fault.c           |  16 ++-
>  include/linux/mm.h             |   3 -
>  include/linux/pagewalk.h       |  58 ++++++++++
>  mm/filemap.c                   |   2 +-
>  mm/gup.c                       |  24 +---
>  mm/huge_memory.c               |  18 +--
>  mm/ksm.c                       | 127 +++++++++------------
>  mm/memory.c                    |   2 +-
>  mm/migrate.c                   | 131 ++++++++++-----------
>  mm/nommu.c                     |   6 -
>  mm/pagewalk.c                  | 202 +++++++++++++++++++++++++++++++++
>  13 files changed, 413 insertions(+), 200 deletions(-)