On Mon 24-08-20 11:36:22, Kirill Tkhai wrote:
> On 22.08.2020 02:49, Peter Xu wrote:
> > From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> >
> > How about we just make sure we're the only possible valid user fo the
> > page before we bother to reuse it?
> >
> > Simplify, simplify, simplify.
> >
> > And get rid of the nasty serialization on the page lock at the same time.
> >
> > Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > [peterx: add subject prefix]
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> >  mm/memory.c | 59 +++++++++++++++--------------------------------------
> >  1 file changed, 17 insertions(+), 42 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 602f4283122f..cb9006189d22 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2927,50 +2927,25 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  	 * not dirty accountable.
> >  	 */
> >  	if (PageAnon(vmf->page)) {
> > -		int total_map_swapcount;
> > -		if (PageKsm(vmf->page) && (PageSwapCache(vmf->page) ||
> > -					   page_count(vmf->page) != 1))
> > +		struct page *page = vmf->page;
> > +
> > +		/* PageKsm() doesn't necessarily raise the page refcount */
>
> No, this is wrong. PageKSM() always raises refcount.

OK, then I'm confused. The comment before get_ksm_page() states:

 * get_ksm_page: checks if the page indicated by the stable node
 * is still its ksm page, despite having held no reference to it.
 * In which case we can trust the content of the page, and it
 * returns the gotten page; but if the page has now been zapped,
 * remove the stale node from the stable tree and return NULL.
...
 * You would expect the stable_node to hold a reference to the ksm page.
 * But if it increments the page's count, swapping out has to wait for
 * ksmd to come around again before it can free the page, which may take
 * seconds or even minutes: much too unresponsive.  So instead we use a
 * "keyhole reference": access to the ksm page from the stable node peeps
 * out through its keyhole to see if that page still holds the right key,
 * pointing back to this stable node.

So this all seems to indicate that KSM doesn't hold a proper page reference
and relies on anyone making page writeable to change page->mapping so that
KSM notices this and doesn't use the page anymore... Am I missing
something?

> There was another
> problem: KSM may raise refcount without lock_page(), and only then it
> takes the lock. See get_ksm_page(GET_KSM_PAGE_NOLOCK) for the details.
>
> So, reliable protection against parallel access requires to freeze page
> counter, which is made in reuse_ksm_page().

OK, this as well.

								Honza

> > +		if (PageKsm(page) || page_count(page) != 1)
> > +			goto copy;
> > +		if (!trylock_page(page))
> > +			goto copy;
> > +		if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
> > +			unlock_page(page);
> >  			goto copy;
> > -		if (!trylock_page(vmf->page)) {
> > -			get_page(vmf->page);
> > -			pte_unmap_unlock(vmf->pte, vmf->ptl);
> > -			lock_page(vmf->page);
> > -			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> > -					vmf->address, &vmf->ptl);
> > -			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
> > -				update_mmu_tlb(vma, vmf->address, vmf->pte);
> > -				unlock_page(vmf->page);
> > -				pte_unmap_unlock(vmf->pte, vmf->ptl);
> > -				put_page(vmf->page);
> > -				return 0;
> > -			}
> > -			put_page(vmf->page);
> > -		}
> > -		if (PageKsm(vmf->page)) {
> > -			bool reused = reuse_ksm_page(vmf->page, vmf->vma,
> > -						     vmf->address);
> > -			unlock_page(vmf->page);
> > -			if (!reused)
> > -				goto copy;
> > -			wp_page_reuse(vmf);
> > -			return VM_FAULT_WRITE;
> > -		}
> > -		if (reuse_swap_page(vmf->page, &total_map_swapcount)) {
> > -			if (total_map_swapcount == 1) {
> > -				/*
> > -				 * The page is all ours. Move it to
> > -				 * our anon_vma so the rmap code will
> > -				 * not search our parent or siblings.
> > -				 * Protected against the rmap code by
> > -				 * the page lock.
> > -				 */
> > -				page_move_anon_rmap(vmf->page, vma);
> > -			}
> > -			unlock_page(vmf->page);
> > -			wp_page_reuse(vmf);
> > -			return VM_FAULT_WRITE;
> >  		}
> > -		unlock_page(vmf->page);
> > +		/*
> > +		 * Ok, we've got the only map reference, and the only
> > +		 * page count reference, and the page is locked,
> > +		 * it's dark out, and we're wearing sunglasses. Hit it.
> > +		 */
> > +		wp_page_reuse(vmf);
> > +		unlock_page(page);
> > +		return VM_FAULT_WRITE;
> >  	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> >  					(VM_WRITE|VM_SHARED))) {
> >  		return wp_page_shared(vmf);
> >
>
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
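
P.S. For anyone following along, the order of checks in the new hunk can be
modelled in userspace roughly as below. This is only a sketch of my reading
of the patch, not kernel code: struct toy_page and the toy_*() helpers are
made-up stand-ins for struct page, trylock_page() and friends, the model is
single-threaded, and it drops the lock right after the check whereas the
patch holds it across wp_page_reuse(). In particular it cannot show the race
discussed above, where KSM raises the refcount before taking the page lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; hypothetical, NOT the kernel layout. */
struct toy_page {
	int refcount;	/* models page_count() */
	int mapcount;	/* models page_mapcount() */
	bool ksm;	/* models PageKsm() */
	bool locked;	/* models the page lock */
};

/* Models trylock_page(): take the lock only if nobody holds it. */
bool toy_trylock(struct toy_page *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

/* Models unlock_page(); the caller must hold the lock. */
void toy_unlock(struct toy_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/*
 * Returns true if a write fault could reuse the page in place, false if
 * it would have to take the "goto copy" path. Mirrors the check order of
 * the hunk: cheap unlocked test, trylock, then the same test again plus
 * the mapcount under the lock.
 */
bool toy_can_reuse(struct toy_page *p)
{
	bool ok;

	if (p->ksm || p->refcount != 1)
		return false;
	if (!toy_trylock(p))
		return false;
	ok = !p->ksm && p->mapcount == 1 && p->refcount == 1;
	toy_unlock(p);
	return ok;
}
```

With this model, a page with refcount 1 and mapcount 1 that is neither KSM
nor locked is reused; any extra reference, extra mapping, KSM flag, or a
lock held by someone else sends the fault to the copy path instead of
waiting.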