The patch titled
     Subject: Re: mm: filemap: use xa_get_order() to get the swap entry order
has been added to the -mm mm-unstable branch.  Its filename is
     mm-filemap-use-xa_get_order-to-get-the-swap-entry-order-fix.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-filemap-use-xa_get_order-to-get-the-swap-entry-order-fix.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Hugh Dickins <hughd@xxxxxxxxxx>
Subject: Re: mm: filemap: use xa_get_order() to get the swap entry order
Date: Thu, 29 Aug 2024 01:07:17 -0700 (PDT)

find_lock_entries(), used in the first pass of shmem_undo_range() and
truncate_inode_pages_range() before partial folios are dealt with, has to
be careful to avoid those partial folios: as its doc helpfully says,
"Folios which are partially outside the range are not returned".  Of
course, the same must be true of any value entries returned, otherwise
truncation and hole-punch risk erasing swapped areas - as has been seen.

Rewrite find_lock_entries() to emphasize that, following the same pattern
for folios and for value entries.

Adjust find_get_entries() slightly, to get order while still holding
rcu_read_lock(), and to round down the updated start: good changes, like
find_lock_entries() now does, but it's unclear if either is ever important.
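
As background for the range checks in the hunks below, here is a minimal
standalone sketch (not part of the patch; the helper name and signature are
invented for illustration) of how an order>0 entry's span is derived from
its index and order, as returned by xa_get_order(), and tested against
[start, end], mirroring the base/nr logic the patch adds:

	/*
	 * Illustrative sketch only, not part of the patch: given an entry's
	 * index and its order, derive the range of indices the entry covers
	 * and test whether that whole range lies inside [start, end].
	 * The helper name is hypothetical.
	 */
	#include <stdbool.h>

	static inline bool entry_fully_within(unsigned long index,
					      unsigned int order,
					      unsigned long start,
					      unsigned long end)
	{
		unsigned long nr = 1UL << order;	/* pages covered by the entry */
		unsigned long base = index & ~(nr - 1);	/* first index of the entry */

		/* Entries partially outside the range must be skipped */
		return base >= start && base + nr - 1 <= end;
	}

find_lock_entries() open-codes the same test, skipping the entry (continue,
break or goto put) rather than returning a boolean, and advances *start to
base + nr as each entry is accepted.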
Link: https://lkml.kernel.org/r/c336e6e4-da7f-b714-c0f1-12df715f2611@xxxxxxxxxx
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Barry Song <baohua@xxxxxxxxxx>
Cc: Chris Li <chrisl@xxxxxxxxxx>
Cc: Daniel Gomez <da.gomez@xxxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
Cc: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
Cc: Lance Yang <ioworker0@xxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
Cc: Yang Shi <shy828301@xxxxxxxxx>
Cc: Zi Yan <ziy@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/filemap.c |   41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

--- a/mm/filemap.c~mm-filemap-use-xa_get_order-to-get-the-swap-entry-order-fix
+++ a/mm/filemap.c
@@ -2046,10 +2046,9 @@ unsigned find_get_entries(struct address
 		if (!folio_batch_add(fbatch, folio))
 			break;
 	}
-	rcu_read_unlock();
 
 	if (folio_batch_count(fbatch)) {
-		unsigned long nr = 1;
+		unsigned long nr;
 		int idx = folio_batch_count(fbatch) - 1;
 
 		folio = fbatch->folios[idx];
@@ -2057,8 +2056,10 @@ unsigned find_get_entries(struct address
 			nr = folio_nr_pages(folio);
 		else
 			nr = 1 << xa_get_order(&mapping->i_pages, indices[idx]);
-		*start = indices[idx] + nr;
+		*start = round_down(indices[idx] + nr, nr);
 	}
+	rcu_read_unlock();
+
 	return folio_batch_count(fbatch);
 }
 
@@ -2090,10 +2091,17 @@ unsigned find_lock_entries(struct addres
 
 	rcu_read_lock();
 	while ((folio = find_get_entry(&xas, end, XA_PRESENT))) {
+		unsigned long base;
+		unsigned long nr;
+
 		if (!xa_is_value(folio)) {
-			if (folio->index < *start)
+			nr = folio_nr_pages(folio);
+			base = folio->index;
+			/* Omit large folio which begins before the start */
+			if (base < *start)
 				goto put;
-			if (folio_next_index(folio) - 1 > end)
+			/* Omit large folio which extends beyond the end */
+			if (base + nr - 1 > end)
 				goto put;
 			if (!folio_trylock(folio))
 				goto put;
@@ -2102,7 +2110,19 @@ unsigned find_lock_entries(struct addres
 				goto unlock;
 			VM_BUG_ON_FOLIO(!folio_contains(folio, xas.xa_index),
 					folio);
+		} else {
+			nr = 1 << xa_get_order(&mapping->i_pages, xas.xa_index);
+			base = xas.xa_index & ~(nr - 1);
+			/* Omit order>0 value which begins before the start */
+			if (base < *start)
+				continue;
+			/* Omit order>0 value which extends beyond the end */
+			if (base + nr - 1 > end)
+				break;
 		}
+
+		/* Update start now so that last update is correct on return */
+		*start = base + nr;
 		indices[fbatch->nr] = xas.xa_index;
 		if (!folio_batch_add(fbatch, folio))
 			break;
@@ -2114,17 +2134,6 @@ put:
 	}
 	rcu_read_unlock();
 
-	if (folio_batch_count(fbatch)) {
-		unsigned long nr = 1;
-		int idx = folio_batch_count(fbatch) - 1;
-
-		folio = fbatch->folios[idx];
-		if (!xa_is_value(folio))
-			nr = folio_nr_pages(folio);
-		else
-			nr = 1 << xa_get_order(&mapping->i_pages, indices[idx]);
-		*start = indices[idx] + nr;
-	}
 	return folio_batch_count(fbatch);
 }
 
_

Patches currently in -mm which might be from hughd@xxxxxxxxxx are

mm-attempt-to-batch-free-swap-entries-for-zap_pte_range-fix-2.patch
mm-filemap-use-xa_get_order-to-get-the-swap-entry-order-fix.patch
mm-shmem-split-large-entry-if-the-swapin-folio-is-not-large-fix.patch
mm-shmem-support-large-folio-swap-out-fix.patch
mm-restart-if-multiple-traversals-raced-fix.patch
mm-shmem-fix-minor-off-by-one-in-shrinkable-calculation.patch
mm-shmem-extend-shmem_unused_huge_shrink-to-all-sizes.patch