On Wed, Aug 16, 2023 at 03:33:30PM +0200, David Hildenbrand wrote:
> On 15.08.23 23:25, Peter Xu wrote:
> > Tail page struct reuse is over-complicated.  Not only because we have
>
> It is complicated, agreed.
>
> With the ->private for THP_SWAP gone, we would have to document less.
> Stating that 4*4byte / 4*8 byte are available after flags+head would
> be sufficient and I'd even drop the table.
>
>
> > implicit uses of tail page fields (mapcounts, or private for thp swap
> > support, etc., that we may still use in the page structs,
>
> Instead of documenting that thp swap should no longer touch the private
> field of tail pages, maybe we can indeed fix that quite easily.
>
> My simple tests passed so far. If there isn't something obvious missing,
> I can do more testing and send this as an official patch.

It would definitely be good to fix it rather than document it, if possible.
I didn't spot anything wrong at a quick look; you may just need a more
complete Cc list for swap.  One trivial comment below.

>
>
> From ec0f8b0dd8fb81c316b6a4c5fc9ae7563e625404 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <david@xxxxxxxxxx>
> Date: Wed, 16 Aug 2023 13:14:45 +0200
> Subject: [PATCH] mm/swap: stop using page->private on tail pages for THP_SWAP
>
> Let's stop using page->private on tail pages, making it possible to
> just unconditionally reuse that field in the tail pages of large folios.
>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
>  arch/arm64/mm/mteswap.c |  5 +++--
>  include/linux/swap.h    |  9 +++++++++
>  mm/huge_memory.c        | 15 ++++++---------
>  mm/memory.c             |  2 +-
>  mm/rmap.c               |  2 +-
>  mm/swap_state.c         |  4 ++--
>  mm/swapfile.c           |  4 ++--
>  7 files changed, 24 insertions(+), 17 deletions(-)
>
> diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
> index cd508ba80ab1..a31833e3ddc5 100644
> --- a/arch/arm64/mm/mteswap.c
> +++ b/arch/arm64/mm/mteswap.c
> @@ -33,8 +33,9 @@ int mte_save_tags(struct page *page)
>  	mte_save_page_tags(page_address(page), tag_storage);
> -	/* page_private contains the swap entry.val set in do_swap_page */
> -	ret = xa_store(&mte_pages, page_private(page), tag_storage, GFP_KERNEL);
> +	/* lookup the swap entry.val from the page */
> +	ret = xa_store(&mte_pages, page_swap_entry(page).val, tag_storage,
> +		       GFP_KERNEL);
>  	if (WARN(xa_is_err(ret), "Failed to store MTE tags")) {
>  		mte_free_tag_storage(tag_storage);
>  		return xa_err(ret);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index bb5adc604144..84fe0e94f5cd 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -339,6 +339,15 @@ static inline swp_entry_t folio_swap_entry(struct folio *folio)
>  	return entry;
>  }
> +static inline swp_entry_t page_swap_entry(struct page *page)
> +{
> +	struct folio *folio = page_folio(page);
> +	swp_entry_t entry = folio_swap_entry(folio);
> +
> +	entry.val += page - &folio->page;
> +	return entry;
> +}
> +
>  static inline void folio_set_swap_entry(struct folio *folio, swp_entry_t entry)
>  {
>  	folio->private = (void *)entry.val;
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0b709d2c46c6..f7e04cbcb063 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2451,18 +2451,15 @@ static void __split_huge_page_tail(struct page *head, int tail,
>  	page_tail->index = head->index + tail;
>  	/*
> -	 * page->private should not be set in tail pages with the exception
> -	 * of swap cache pages that store the swp_entry_t in tail pages.
> -	 * Fix up and warn once if private is unexpectedly set.
> -	 *
> -	 * What of 32-bit systems, on which folio->_pincount overlays
> -	 * head[1].private?  No problem: THP_SWAP is not enabled on 32-bit, and
> -	 * pincount must be 0 for folio_ref_freeze() to have succeeded.
> +	 * page->private should not be set in tail pages. Fix up and warn once
> +	 * if private is unexpectedly set.
>  	 */
> -	if (!folio_test_swapcache(page_folio(head))) {
> -		VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, page_tail);
> +	if (unlikely(page_tail->private)) {
> +		VM_WARN_ON_ONCE_PAGE(true, page_tail);
>  		page_tail->private = 0;
>  	}
> +	if (PageSwapCache(head))
> +		set_page_private(page_tail, (unsigned long)head->private + tail);
>  	/* Page flags must be visible before we make the page non-compound. */
>  	smp_wmb();
> diff --git a/mm/memory.c b/mm/memory.c
> index d003076b218d..ff13242c1589 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3882,7 +3882,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  		 * changed.
>  		 */
>  		if (unlikely(!folio_test_swapcache(folio) ||
> -			     page_private(page) != entry.val))
> +			     page_swap_entry(page).val != entry.val))
>  			goto out_page;
>  		/*
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1f04debdc87a..ec7f8e6c9e48 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1647,7 +1647,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>  			 */
>  			dec_mm_counter(mm, mm_counter(&folio->page));
>  		} else if (folio_test_anon(folio)) {
> -			swp_entry_t entry = { .val = page_private(subpage) };
> +			swp_entry_t entry = page_swap_entry(subpage);
>  			pte_t swp_pte;
>  			/*
>  			 * Store the swap location in the pte.
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 01f15139b7d9..450819934e34 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -100,6 +100,7 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
>  	folio_ref_add(folio, nr);
>  	folio_set_swapcache(folio);
> +	folio_set_swap_entry(folio, entry);
>  	do {
>  		xas_lock_irq(&xas);
> @@ -113,7 +114,6 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
>  				if (shadowp)
>  					*shadowp = old;
>  			}
> -			set_page_private(folio_page(folio, i), entry.val + i);
>  			xas_store(&xas, folio);
>  			xas_next(&xas);
>  		}
> @@ -154,9 +154,9 @@ void __delete_from_swap_cache(struct folio *folio,
>  	for (i = 0; i < nr; i++) {
>  		void *entry = xas_store(&xas, shadow);
>  		VM_BUG_ON_PAGE(entry != folio, entry);
> -		set_page_private(folio_page(folio, i), 0);
>  		xas_next(&xas);
>  	}
> +	folio->private = 0;

I'd rather remove all direct references to "private" for swap alongside, if
this would be the last spot (perhaps folio_set_swap_entry()).

>  	folio_clear_swapcache(folio);
>  	address_space->nrpages -= nr;
>  	__node_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index d46933adf789..bd9d904671b9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3369,7 +3369,7 @@ struct swap_info_struct *swp_swap_info(swp_entry_t entry)
>  struct swap_info_struct *page_swap_info(struct page *page)
>  {
> -	swp_entry_t entry = { .val = page_private(page) };
> +	swp_entry_t entry = page_swap_entry(page);
>  	return swp_swap_info(entry);
>  }
> @@ -3384,7 +3384,7 @@ EXPORT_SYMBOL_GPL(swapcache_mapping);
>  pgoff_t __page_file_index(struct page *page)
>  {
> -	swp_entry_t swap = { .val = page_private(page) };
> +	swp_entry_t swap = page_swap_entry(page);
>  	return swp_offset(swap);
>  }
>  EXPORT_SYMBOL_GPL(__page_file_index);
> --
> 2.41.0
>
>
> --
> Cheers,
>
> David / dhildenb
>

--
Peter Xu
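
The arithmetic behind page_swap_entry() in the patch can be tried outside the
kernel. What follows is only a minimal user-space sketch, not kernel code: the
struct layouts, MODEL_NR_PAGES, swap_model_demo() and the explicit folio
argument (the real helper finds the folio through page_folio()) are
assumptions made for illustration. It relies on the same property the removed
set_page_private(..., entry.val + i) loop relied on, namely that subpage i of
a folio in the swap cache has swap entry "head entry + i".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
typedef struct { unsigned long val; } swp_entry_t;

struct page { unsigned long private; };

#define MODEL_NR_PAGES 8	/* illustrative folio size, chosen arbitrarily */

struct folio {
	struct page pages[MODEL_NR_PAGES];	/* pages[0] models the head page */
	size_t nr;
};

/* Only the head page's private field carries the swap entry. */
static swp_entry_t folio_swap_entry(const struct folio *folio)
{
	swp_entry_t entry = { .val = folio->pages[0].private };

	return entry;
}

/* Derive a subpage's entry from the head entry plus the subpage index. */
static swp_entry_t page_swap_entry(const struct folio *folio,
				   const struct page *page)
{
	swp_entry_t entry = folio_swap_entry(folio);

	entry.val += (unsigned long)(page - &folio->pages[0]);
	return entry;
}

/* Walk every subpage and check the derived entry; tail private stays 0. */
static int swap_model_demo(void)
{
	struct folio folio = { .nr = MODEL_NR_PAGES };
	size_t i;

	folio.pages[0].private = 0x1000;
	for (i = 0; i < folio.nr; i++) {
		swp_entry_t e = page_swap_entry(&folio, &folio.pages[i]);

		assert(e.val == 0x1000 + i);
		if (i > 0)
			assert(folio.pages[i].private == 0);
	}
	return 0;
}
```

In this model, calling swap_model_demo() from a test harness exercises every
subpage. The point it illustrates is that nothing is ever written to the tail
pages' private fields, which is what would free them for reuse.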