On Wed 08-05-13 12:43:26, Andrew Morton wrote: > > The patch titled > Subject: mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge > has been added to the -mm tree. Its filename is > mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge.patch > > Before you just go and hit "reply", please: > a) Consider who else should be cc'ed > b) Prefer to cc a suitable mailing list as well > c) Ideally: find the original patch on the mailing list and do a > reply-to-all to that, adding suitable additional cc's > > *** Remember to use Documentation/SubmitChecklist when testing your code *** > > The -mm tree is included into linux-next and is updated > there every 3-4 working days > > ------------------------------------------------------ > From: Johannes Weiner <hannes@xxxxxxxxxxx> > Subject: mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge > > 0c59b89 ("mm: memcg: push down PageSwapCache check into uncharge entry > functions") added a VM_BUG_ON() on PageSwapCache in the uncharge path > after checking that page flag once, assuming that the state is stable in > all paths, but this is not the case and the condition triggers in user > environments. An uncharge after the last page table reference to the page > goes away can race with reclaim adding the page to swap cache. > > Swap cache pages are usually uncharged when they are freed after swapout, > from a path that also handles swap usage accounting and memcg lifetime > management. However, since the last page table reference is gone and thus > no references to the swap slot left, the swap slot will be freed shortly > when reclaim attempts to write the page to disk. The whole swap > accounting is not even necessary. > > So while the race condition for which this VM_BUG_ON was added is real and > actually existed all along, there are no negative effects. Remove the > VM_BUG_ON again. > > Reported-by: Heiko Carstens <heiko.carstens@xxxxxxxxxx> > Reported-by: Lingzhu Xiang <lxiang@xxxxxxxxxx> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> > Acked-by: Hugh Dickins <hughd@xxxxxxxxxx> > Cc: Michal Hocko <mhocko@xxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxx> Thanks! > Cc: <stable@xxxxxxxxxxxxxxx> > > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > --- > > mm/memcontrol.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff -puN mm/memcontrol.c~mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge mm/memcontrol.c > --- a/mm/memcontrol.c~mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge > +++ a/mm/memcontrol.c > @@ -4108,8 +4108,6 @@ __mem_cgroup_uncharge_common(struct page > if (mem_cgroup_disabled()) > return NULL; > > - VM_BUG_ON(PageSwapCache(page)); > - > if (PageTransHuge(page)) { > nr_pages <<= compound_order(page); > VM_BUG_ON(!PageTransHuge(page)); > @@ -4205,6 +4203,18 @@ void mem_cgroup_uncharge_page(struct pag > if (page_mapped(page)) > return; > VM_BUG_ON(page->mapping && !PageAnon(page)); > + /* > + * If the page is in swap cache, uncharge should be deferred > + * to the swap path, which also properly accounts swap usage > + * and handles memcg lifetime. > + * > + * Note that this check is not stable and reclaim may add the > + * page to swap cache at any time after this. However, if the > + * page is not in swap cache by the time page->mapcount hits > + * 0, there won't be any page table references to the swap > + * slot, and reclaim will free it and not actually write the > + * page to disk. > + */ > if (PageSwapCache(page)) > return; > __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false); > _ > > Patches currently in -mm which might be from hannes@xxxxxxxxxxx are > > origin.patch > mm-memcg-remove-incorrect-vm_bug_on-for-swap-cache-pages-in-uncharge.patch > mm-memmap_init_zone-performance-improvement.patch > memcg-debugging-facility-to-access-dangling-memcgs.patch > debugging-keep-track-of-page-owners-fix-2-fix-fix-fix.patch > -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html