+ mmthprmap-fix-races-between-updates-of-subpages_mapcount.patch added to mm-unstable branch

The patch titled
     Subject: mm,thp,rmap: fix races between updates of subpages_mapcount
has been added to the -mm mm-unstable branch.  Its filename is
     mmthprmap-fix-races-between-updates-of-subpages_mapcount.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mmthprmap-fix-races-between-updates-of-subpages_mapcount.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Hugh Dickins <hughd@xxxxxxxxxx>
Subject: mm,thp,rmap: fix races between updates of subpages_mapcount
Date: Sun, 4 Dec 2022 17:57:07 -0800 (PST)

Commit 4b51634cd16a, which introduced the COMPOUND_MAPPED bit, took care
that subpages_mapcount can never appear negative; but it did not attend
to those races in which the count can momentarily appear larger than
thought possible.

These arise from how page_remove_rmap() first decrements page->_mapcount
or compound_mapcount, then, if that transition goes negative (logical 0),
decrements subpages_mapcount.  The initial decrement lets a racing
page_add_*_rmap() reincrement _mapcount or compound_mapcount immediately,
and then in rare cases its corresponding increment of subpages_mapcount
may be completed before page_remove_rmap()'s decrement.  There could even
(with increasing unlikelihood) be a series of increments intermixed with
the decrements.
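
To make that interleaving concrete, here is a minimal userspace sketch -
not kernel code: the counter names are borrowed, the -1 bias on mapcounts
is ignored, and COMPOUND_MAPPED is taken as 0x800000 - which replays the
two halves of page_remove_rmap() around a complete racing compound
page_add_anon_rmap():

#include <stdio.h>
#include <stdatomic.h>

#define COMPOUND_MAPPED 0x800000

/* Simplified stand-ins: these count mappings directly, ignoring the
 * kernel's -1 bias on mapcounts, which does not matter for the race. */
static atomic_int compound_mapcount;
static atomic_int subpages_mapcount;

int main(void)
{
	/* Start: the page is PMD-mapped exactly once. */
	atomic_store(&compound_mapcount, 1);
	atomic_store(&subpages_mapcount, COMPOUND_MAPPED);

	/* page_remove_rmap(), first atomic op: drop the compound mapping. */
	atomic_fetch_sub(&compound_mapcount, 1);	/* now 0: "last" */

	/* Before its second atomic op runs, a racing page_add_anon_rmap()
	 * completes both of its atomic ops: */
	atomic_fetch_add(&compound_mapcount, 1);	/* 0 -> 1: "first" again */
	atomic_fetch_add(&subpages_mapcount, COMPOUND_MAPPED);

	printf("transient subpages_mapcount = %#x\n",	/* 0x1000000 */
	       atomic_load(&subpages_mapcount));

	/* page_remove_rmap(), second atomic op, finally runs: */
	atomic_fetch_sub(&subpages_mapcount, COMPOUND_MAPPED);

	printf("settled subpages_mapcount   = %#x\n",	/* 0x800000 */
	       atomic_load(&subpages_mapcount));
	return 0;
}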

In practice, checking subpages_mapcount with a temporary WARN on range
has caught values of 0x1000000 (2*COMPOUND_MAPPED, when move_pages() was
using remove_migration_pmd()) and 0x800201 (do_huge_pmd_wp_page() using
__split_huge_pmd()): page_add_anon_rmap() racing page_remove_rmap(), as
predicted.
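
For reference, taking COMPOUND_MAPPED as 0x800000 (so SUBPAGES_MAPPED is
0x7fffff), those two observed values decompose as:

	0x1000000 = COMPOUND_MAPPED + COMPOUND_MAPPED
		    (two compound mappings counted at once)
	0x800201  = COMPOUND_MAPPED + 0x201
		    (one compound mapping plus 0x201 in the subpages field)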

I certainly found this harder to reason about than when the updates were
bit_spin_locked, but the easy case gives a clue to how to handle the
harder case.  The easy case is the three !(nr & COMPOUND_MAPPED) checks,
which should obviously be replaced by (nr < COMPOUND_MAPPED) checks - to
count a page as compound mapped even while the bit in that position is 0.
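
An illustration of the difference, on a hypothetical transient value: if
a racing add has pushed subpages_mapcount to 0x1000001 (two
COMPOUND_MAPPED plus one subpage mapping), the bit in the COMPOUND_MAPPED
position reads as 0:

	!(0x1000001 & COMPOUND_MAPPED) == 1	/* wrongly: not compound mapped */
	 (0x1000001 < COMPOUND_MAPPED) == 0	/* correctly: still compound mapped */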

The harder case is deciding how many subpages are newly covered or
uncovered when a compound mapping is first added or last removed, without
knowing all that racily happened between the first and second atomic ops.

But the easy way to handle that is, again, to count the page as compound
mapped for as long as its subpages_mapcount indicates so - ignoring the
_mapcount or compound_mapcount transition while it is on the way to being
reversed.
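
That is what the checks added below implement.  A worked reading, with
hypothetical transient values and COMPOUND_MAPPED taken as 0x800000:

	add side:    nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped)
		     returns 0x1000000: nr >= COMPOUND_MAPPED + COMPOUND_MAPPED,
		     so a racing remove has not yet subtracted its
		     COMPOUND_MAPPED - the page never stopped being compound
		     mapped, count nothing.

	remove side: nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped)
		     returns 0x800201: nr >= COMPOUND_MAPPED, so an add of
		     COMPOUND_MAPPED raced ahead - the page is still compound
		     mapped, count nothing.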

Link: https://lkml.kernel.org/r/4388158-3092-a960-ff2d-55f2b0fe4ef8@xxxxxxxxxx
Fixes: 4b51634cd16a ("mm,thp,rmap: subpages_mapcount COMPOUND_MAPPED if PMD-mapped")
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: James Houghton <jthoughton@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: "Kirill A . Shutemov" <kirill@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Miaohe Lin <linmiaohe@xxxxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Mina Almasry <almasrymina@xxxxxxxxxx>
Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Yang Shi <shy828301@xxxxxxxxx>
Cc: Zach O'Keefe <zokeefe@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/rmap.c |   42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

--- a/mm/rmap.c~mmthprmap-fix-races-between-updates-of-subpages_mapcount
+++ a/mm/rmap.c
@@ -1232,7 +1232,7 @@ void page_add_anon_rmap(struct page *pag
 		if (first && PageCompound(page)) {
 			mapped = subpages_mapcount_ptr(compound_head(page));
 			nr = atomic_inc_return_relaxed(mapped);
-			nr = !(nr & COMPOUND_MAPPED);
+			nr = (nr < COMPOUND_MAPPED);
 		}
 	} else if (PageTransHuge(page)) {
 		/* That test is redundant: it's for safety or to optimize out */
@@ -1241,8 +1241,16 @@ void page_add_anon_rmap(struct page *pag
 		if (first) {
 			mapped = subpages_mapcount_ptr(page);
 			nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
-			nr_pmdmapped = thp_nr_pages(page);
-			nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+			if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
+				nr_pmdmapped = thp_nr_pages(page);
+				nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+				/* Raced ahead of a remove and another add? */
+				if (unlikely(nr < 0))
+					nr = 0;
+			} else {
+				/* Raced ahead of a remove of COMPOUND_MAPPED */
+				nr = 0;
+			}
 		}
 	}
 
@@ -1330,7 +1338,7 @@ void page_add_file_rmap(struct page *pag
 		if (first && PageCompound(page)) {
 			mapped = subpages_mapcount_ptr(compound_head(page));
 			nr = atomic_inc_return_relaxed(mapped);
-			nr = !(nr & COMPOUND_MAPPED);
+			nr = (nr < COMPOUND_MAPPED);
 		}
 	} else if (PageTransHuge(page)) {
 		/* That test is redundant: it's for safety or to optimize out */
@@ -1339,8 +1347,16 @@ void page_add_file_rmap(struct page *pag
 		if (first) {
 			mapped = subpages_mapcount_ptr(page);
 			nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
-			nr_pmdmapped = thp_nr_pages(page);
-			nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+			if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
+				nr_pmdmapped = thp_nr_pages(page);
+				nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+				/* Raced ahead of a remove and another add? */
+				if (unlikely(nr < 0))
+					nr = 0;
+			} else {
+				/* Raced ahead of a remove of COMPOUND_MAPPED */
+				nr = 0;
+			}
 		}
 	}
 
@@ -1387,7 +1403,7 @@ void page_remove_rmap(struct page *page,
 		if (last && PageCompound(page)) {
 			mapped = subpages_mapcount_ptr(compound_head(page));
 			nr = atomic_dec_return_relaxed(mapped);
-			nr = !(nr & COMPOUND_MAPPED);
+			nr = (nr < COMPOUND_MAPPED);
 		}
 	} else if (PageTransHuge(page)) {
 		/* That test is redundant: it's for safety or to optimize out */
@@ -1396,8 +1412,16 @@ void page_remove_rmap(struct page *page,
 		if (last) {
 			mapped = subpages_mapcount_ptr(page);
 			nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
-			nr_pmdmapped = thp_nr_pages(page);
-			nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+			if (likely(nr < COMPOUND_MAPPED)) {
+				nr_pmdmapped = thp_nr_pages(page);
+				nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED);
+				/* Raced ahead of another remove and an add? */
+				if (unlikely(nr < 0))
+					nr = 0;
+			} else {
+				/* An add of COMPOUND_MAPPED raced ahead */
+				nr = 0;
+			}
 		}
 	}
 
_

Patches currently in -mm which might be from hughd@xxxxxxxxxx are

tmpfs-fix-data-loss-from-failed-fallocate.patch
mm-memcg-fix-swapcached-stat-accounting.patch
mmthprmap-fix-races-between-updates-of-subpages_mapcount.patch



