+ mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled
     Subject: mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge
has been added to the -mm tree.  Its filename is
     mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Johannes Weiner <hannes@xxxxxxxxxxx>
Subject: mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge

0c59b89 ("mm: memcg: push down PageSwapCache check into uncharge entry
functions") added a VM_BUG_ON() on PageSwapCache in the uncharge path
after checking that page flag once, assuming that the state is stable in
all paths, but this is not the case and the condition triggers in user
environments.  An uncharge after the last page table reference to the page
goes away can race with reclaim adding the page to swap cache.

Swap cache pages are usually uncharged when they are freed after swapout,
from a path that also handles swap usage accounting and memcg lifetime
management.  However, since the last page table reference is gone and thus
no references to the swap slot left, the swap slot will be freed shortly
when reclaim attempts to write the page to disk.  The whole swap
accounting is not even necessary.

So while the race condition for which this VM_BUG_ON was added is real and
actually existed all along, there are no negative effects.  Remove the
VM_BUG_ON again.

Reported-by: Heiko Carstens <heiko.carstens@xxxxxxxxxx>
Reported-by: Lingzhu Xiang <lxiang@xxxxxxxxxx>
Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>

Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memcontrol.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff -puN mm/memcontrol.c~mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge
+++ a/mm/memcontrol.c
@@ -4108,8 +4108,6 @@ __mem_cgroup_uncharge_common(struct page
 	if (mem_cgroup_disabled())
 		return NULL;
 
-	VM_BUG_ON(PageSwapCache(page));
-
 	if (PageTransHuge(page)) {
 		nr_pages <<= compound_order(page);
 		VM_BUG_ON(!PageTransHuge(page));
@@ -4205,6 +4203,18 @@ void mem_cgroup_uncharge_page(struct pag
 	if (page_mapped(page))
 		return;
 	VM_BUG_ON(page->mapping && !PageAnon(page));
+	/*
+	 * If the page is in swap cache, uncharge should be deferred
+	 * to the swap path, which also properly accounts swap usage
+	 * and handles memcg lifetime.
+	 *
+	 * Note that this check is not stable and reclaim may add the
+	 * page to swap cache at any time after this.  However, if the
+	 * page is not in swap cache by the time page->mapcount hits
+	 * 0, there won't be any page table references to the swap
+	 * slot, and reclaim will free it and not actually write the
+	 * page to disk.
+	 */
 	if (PageSwapCache(page))
 		return;
 	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
_

Patches currently in -mm which might be from hannes@xxxxxxxxxxx are

origin.patch
mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge.patch
mm-memmap_init_zone-performance-improvement.patch
memcg-debugging-facility-to-access-dangling-memcgs.patch
debugging-keep-track-of-page-owners-fix-2-fix-fix-fix.patch

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]