On Wed, Jun 10, 2015 at 10:23:05AM +0300, Cyrill Gorcunov wrote:
> On Wed, Jun 10, 2015 at 08:52:06AM +0900, Minchan Kim wrote:
> > > > +++ b/mm/memory.c
> > > > @@ -2557,9 +2557,11 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
> > > >
> > > >  	inc_mm_counter_fast(mm, MM_ANONPAGES);
> > > >  	dec_mm_counter_fast(mm, MM_SWAPENTS);
> > > > -	pte = mk_pte(page, vma->vm_page_prot);
> > > > +
> > > > +	/* Mark dirty bit of page table because MADV_FREE relies on it */
> > > > +	pte = pte_mkdirty(mk_pte(page, vma->vm_page_prot));
> > > >  	if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
> > > > -		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> > > > +		pte = maybe_mkwrite(pte, vma);
> > > >  		flags &= ~FAULT_FLAG_WRITE;
> > > >  		ret |= VM_FAULT_WRITE;
> > > >  		exclusive = 1;
> > >
> > > Hi Minchan! Really sorry for delay in reply. Look, I don't understand
> > > the moment -- if page has fault on read then before the patch the
> > > PTE won't carry the dirty flag but now we do set it up unconditionally
> > > and to me it looks somehow strange at least because this as well
> > > sets soft-dirty bit on pages which were not modified but only swapped
> > > out. Am I missing something obvious?
> >
> > It's same one I sent a while ago and you said it's okay at that time. ;-)
>
> Ah, I recall. If there is no way to escape dirtifying the page in pte itself
> maybe we should at least not make it softdirty on read faults?

You mean this?

diff --git a/mm/memory.c b/mm/memory.c
index e1c45d0..c95340d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2557,9 +2557,14 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	inc_mm_counter_fast(mm, MM_ANONPAGES);
 	dec_mm_counter_fast(mm, MM_SWAPENTS);
-	pte = mk_pte(page, vma->vm_page_prot);
+
+	/* Mark dirty bit of page table because MADV_FREE relies on it */
+	pte = pte_mkdirty(mk_pte(page, vma->vm_page_prot));
+	if (!(flags & FAULT_FLAG_WRITE))
+		pte = pte_clear_flags(pte, _PAGE_SOFT_DIRTY);
+
 	if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
-		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
+		pte = maybe_mkwrite(pte, vma);
 		flags &= ~FAULT_FLAG_WRITE;
 		ret |= VM_FAULT_WRITE;
 		exclusive = 1;

It could be doable if nobody has a strong objection to this patchset.
I will wait for more review.

Thanks.

>
> > Okay, It might be lack of description compared to one I sent long time ago
> > because I moved some part of description to another patch and I didn't Cc
> > you. Sorry. I hope below will remind you.
> >
> > https://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg857827.html
> >
> > In summary, the problem is that in MADV_FREE point of view,
> > clean anonymous page(ie, no dirty) in page table entry has a problem
> > about sudden discarding under us by reclaimer. Otherwise, VM cannot
> > discard MADV_FREE hinted pages by PageDirty flag of page descriptor.
> >
> > This patchset aims for solving the problem.
> > Please feel free to ask if you have questions without wasting your time
> > unless you can remind after reading above URL
> >
> > Thanks for looking!

-- 
Kind regards,
Minchan Kim