On 3/12/20 9:22 AM, Michal Hocko wrote: > [Cc akpm] > > So what about this? > > From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@xxxxxxxx> > Date: Thu, 12 Mar 2020 09:04:35 +0100 > Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages > > Jann has brought up a very interesting point [1]. While shared pages are > excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed > that way. This can lead to all sorts of hard to debug problems. E.g. > performance problems outlined by Daniel [2]. There are runtime > environments where there is a substantial memory shared among security > domains via CoW memory and a easy to reclaim way of that memory, which > MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation > in for the parent process which might be more privileged or even open > side channel attacks. The feasibility of the later is not really clear > to me TBH but there is no real reason for exposure at this stage. It > seems there is no real use case to depend on reclaiming CoW memory via > madvise at this stage so it is much easier to simply disallow it and > this is what this patch does. Put it simply MADV_{PAGEOUT,COLD} can > operate only on the exclusively owned memory which is a straightforward > semantic. > > [1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@xxxxxxxxxxxxxx > [2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@xxxxxxxxxxxxxx > Reported-by: Jann Horn <jannh@xxxxxxxxxx> Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD") Maybe CC stable? > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx> Acked-by: Vlastimil Babka <vbabka@xxxxxxx> > --- > mm/madvise.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 43b47d3fae02..4bb30ed6c8d2 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, > } > > page = pmd_page(orig_pmd); > + > + /* Do not interfere with other mappings of this page */ > + if (page_mapcount(page) != 1) > + goto huge_unlock; > + > if (next - addr != HPAGE_PMD_SIZE) { > int err; > > - if (page_mapcount(page) != 1) > - goto huge_unlock; > - > get_page(page); > spin_unlock(ptl); > lock_page(page); > @@ -426,6 +428,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, > continue; > } > > + /* Do not interfere with other mappings of this page */ > + if (page_mapcount(page) != 1) > + continue; > + > VM_BUG_ON_PAGE(PageTransCompound(page), page); > > if (pte_young(ptent)) { >