MADV_FREE relies on the dirty bit in the page table entry to decide
whether the VM may discard a page. In other words, if the page table
entry has the dirty bit set, the VM must not discard the page.

However, when swapoff happens, the page table entry it installs for the
page does not have the dirty bit set, so MADV_FREE might wrongly discard
the page.

To fix the problem, this patch marks the page table entry as dirty when
swapoff happens, so that the VM does not discard the page suddenly under
us.

From MADV_FREE's point of view, marking the entry dirty unconditionally
is not a problem: we already drop swapped-out pages in the MADV_FREE
syscall context (see madvise_free_pte_range), so no page being swapped
in is an MADV_FREE-hinted page.

Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
 mm/swapfile.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index a7e72103f23b..cc8b79ab2190 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1118,8 +1118,12 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
 	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
 	get_page(page);
+	/*
+	 * For preventing sudden freeing by MADV_FREE, pte must have a
+	 * dirty flag.
+	 */
 	set_pte_at(vma->vm_mm, addr, pte,
-		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
+		   pte_mkdirty(pte_mkold(mk_pte(page, vma->vm_page_prot))));
 	if (page == swapcache) {
 		page_add_anon_rmap(page, vma, addr);
 		mem_cgroup_commit_charge(page, memcg, true);
-- 
1.9.1
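
For readers who want to see the rule in isolation, here is a minimal
userspace sketch of the reasoning in the description above. It is not
kernel code: the toy_* names, the flag bits and the PTE layout are all
hypothetical and exist only for illustration. It assumes, as the
description states, that a PTE without the dirty bit is treated as
discardable, and compares the PTE that unuse_pte() builds before and
after the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy PTE; not the kernel's real layout. */
#define TOY_PTE_PRESENT (1u << 0)
#define TOY_PTE_YOUNG   (1u << 1)
#define TOY_PTE_DIRTY   (1u << 2)

typedef uint32_t toy_pte_t;

/* Stand-in for mk_pte(): assume a fresh entry is present and young. */
static toy_pte_t toy_mk_pte(void)
{
	return TOY_PTE_PRESENT | TOY_PTE_YOUNG;
}

/* Stand-in for pte_mkold(): clear the accessed (young) bit. */
static toy_pte_t toy_pte_mkold(toy_pte_t pte)
{
	return pte & ~TOY_PTE_YOUNG;
}

/* Stand-in for pte_mkdirty(): set the dirty bit. */
static toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
	return pte | TOY_PTE_DIRTY;
}

/* Rule from the description: a clean PTE means the page may be discarded. */
static bool toy_may_discard(toy_pte_t pte)
{
	return (pte & TOY_PTE_DIRTY) == 0;
}

/* PTE built on swapoff before the patch: old, but not dirty. */
static toy_pte_t toy_unuse_pte_before(void)
{
	return toy_pte_mkold(toy_mk_pte());
}

/* PTE built on swapoff after the patch: old and dirty. */
static toy_pte_t toy_unuse_pte_after(void)
{
	return toy_pte_mkdirty(toy_pte_mkold(toy_mk_pte()));
}
```

In this model, toy_may_discard() is true for the entry built by
toy_unuse_pte_before() and false for the one built by
toy_unuse_pte_after(), which is the behaviour change the patch aims for.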