On Thu, Mar 12, 2020 at 02:41:07PM -0700, Dave Hansen wrote:
> One other fun thing. I have a "victim" thread sitting in a loop doing:
>
> 	sleep(1)
> 	memcpy(&garbage, buffer, sz);
>
> The "attacker" is doing
>
> 	madvise(buffer, sz, MADV_PAGEOUT);
>
> in a loop. That, oddly enough doesn't cause the victim to page fault.
> But, if I do:
>
> 	memcpy(&garbage, buffer, sz);
> 	madvise(buffer, sz, MADV_PAGEOUT);
>
> It *does* cause the memory to get paged out. The MADV_PAGEOUT code
> actually has a !pte_present() check. It will punt on a PTE if it sees
> it. In other words, if a page is in the swap cache but not mapped by a
> pte_present() PTE, MADV_PAGEOUT won't touch it.
>
> Shouldn't MADV_PAGEOUT be able to find and reclaim those pages? Patch
> attached.
>
>
> ---
>
>  b/mm/madvise.c | 38 +++++++++++++++++++++++++++++++-------
>  1 file changed, 31 insertions(+), 7 deletions(-)
>
> diff -puN mm/madvise.c~madv-pageout-find-swap-cache mm/madvise.c
> --- a/mm/madvise.c~madv-pageout-find-swap-cache	2020-03-12 14:24:45.178775035 -0700
> +++ b/mm/madvise.c	2020-03-12 14:35:49.706773378 -0700
> @@ -248,6 +248,36 @@ static void force_shm_swapin_readahead(s
>  #endif /* CONFIG_SWAP */
>
>  /*
> + * Given a PTE, find the corresponding 'struct page'. Also handles
> + * non-present swap PTEs.
> + */
> +struct page *pte_to_reclaim_page(struct vm_area_struct *vma,
> +				 unsigned long addr, pte_t ptent)
> +{
> +	swp_entry_t entry;
> +
> +	/* Totally empty PTE: */
> +	if (pte_none(ptent))
> +		return NULL;
> +
> +	/* A normal, present page is mapped: */
> +	if (pte_present(ptent))
> +		return vm_normal_page(vma, addr, ptent);
> +

Please check is_swap_pte() first.

> +	entry = pte_to_swp_entry(ptent);
> +	/* Is it one of the "swap PTEs" that's not really swap? */
> +	if (non_swap_entry(entry))
> +		return NULL;
> +
> +	/*
> +	 * The PTE was a true swap entry. The page may be in the
> +	 * swap cache. If so, find it and return it so it may be
> +	 * reclaimed.
> +	 */
> +	return lookup_swap_cache(entry, vma, addr);

If we go with handling only exclusively owned pages for anon, I think we
should apply the same rule to the swap cache, too.

Do you mind posting it as a formal patch?

Thanks for the explanation of the vulnerability and for the patch, Dave!

> +}
> +
> +/*
>   * Schedule all required I/O operations. Do not wait for completion.
>   */
>  static long madvise_willneed(struct vm_area_struct *vma,
> @@ -389,13 +419,7 @@ regular_page:
>  	for (; addr < end; pte++, addr += PAGE_SIZE) {
>  		ptent = *pte;
>
> -		if (pte_none(ptent))
> -			continue;
> -
> -		if (!pte_present(ptent))
> -			continue;
> -
> -		page = vm_normal_page(vma, addr, ptent);
> +		page = pte_to_reclaim_page(vma, addr, ptent);
>  		if (!page)
>  			continue;
>
> _
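
A rough userspace sketch of one way to read the two comments above, in case
it helps: test for a swap PTE first, then apply one exclusive-ownership rule
to both mapped pages and swap-cache pages. This is a toy model, not kernel
code. Every toy_* name and the "owners" field are made up for illustration,
and how ownership would really be determined for a swap-cache page is left
open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types. All names below are hypothetical. */
enum toy_pte_kind {
	TOY_PTE_NONE,		/* empty PTE, models pte_none() */
	TOY_PTE_PRESENT,	/* mapped page, models pte_present() */
	TOY_PTE_SWAP,		/* true swap entry */
	TOY_PTE_NONSWAP,	/* models non_swap_entry(), e.g. migration */
};

struct toy_page {
	int owners;	/* assumption: 1 means exclusively owned */
};

struct toy_pte {
	enum toy_pte_kind kind;
	struct toy_page *page;	/* mapped or swap-cache page, NULL on a miss */
};

/* Models is_swap_pte(): the PTE is neither empty nor present. */
bool toy_is_swap_pte(struct toy_pte pte)
{
	return pte.kind != TOY_PTE_NONE && pte.kind != TOY_PTE_PRESENT;
}

/* Returns the page to reclaim, or NULL if this PTE should be skipped. */
struct toy_page *toy_pte_to_reclaim_page(struct toy_pte pte)
{
	struct toy_page *page;

	if (toy_is_swap_pte(pte)) {
		/* "Swap PTEs" that are not really swap are skipped. */
		if (pte.kind == TOY_PTE_NONSWAP)
			return NULL;
		/* Stands in for the swap cache lookup; NULL on a miss. */
		page = pte.page;
	} else if (pte.kind == TOY_PTE_PRESENT) {
		page = pte.page;
	} else {
		/* Totally empty PTE. */
		return NULL;
	}

	/* Same ownership rule for mapped and swap-cache pages. */
	if (page && page->owners != 1)
		return NULL;

	return page;
}
```

With this ordering, a caller's loop only needs the single "if (!page)
continue;" check, since every skip case returns NULL from the helper.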