On Thu, Aug 19, 2021 at 11:18:55AM +0800, Qi Zheng wrote:
> diff --git a/mm/gup.c b/mm/gup.c
> index 2630ed1bb4f4..30757f3b176c 100644
> +++ b/mm/gup.c
> @@ -500,6 +500,9 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  	if (unlikely(pmd_bad(*pmd)))
>  		return no_page_table(vma, flags);
>  
> +	if (!pte_try_get(mm, pmd))
> +		return no_page_table(vma, flags);
> +
>  	ptep = pte_offset_map_lock(mm, pmd, address, &ptl);

This is not good on a performance path: the pte_try_get() is
locking/unlocking the same lock that pte_offset_map_lock() then takes
again. This would be much better if the map_lock infra could manage
the refcount itself.

I'm also not really keen on adding ptl-level locking to all the
currently no-lock paths. If we are doing that then the no-lock paths
should rely on the ptl for a lot more of their operations and avoid
the complicated no-lock data access we have. eg 'pte_try_get()' should
also copy the pte_t under the lock (a rough sketch of this is at the
end of this mail).

Also, I don't really understand how this scheme works with
get_user_pages_fast. Currently the zap triggers a TLB invalidation,
which synchronizes with GUP fast; however, that only makes the ptes
non-present. Its purpose is to synchronize with the struct page
refcount, not a pte refcount. With this series the page table holding
the non-present ptes is freed as well, so how does that synchronize
with GUP fast to avoid a use-after-free on the pte's struct page?
(The relevant GUP-fast path is quoted below for reference.)

I agree with David: this series needs significant splitting to be
readable, and a lot more explanation in the commit messages of how all
the locking works. Eg introducing the freeing should be a single short
patch at the end, with a full explanation of the locking in all the
major scenarios.
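To make the locking points concrete, a combined helper could look
roughly like the sketch below. This is only a sketch, not a drop-in
patch: pte_offset_map_lock_get() is an invented name, and
pte_get_unless_zero() stands in for whatever try-get primitive the
series actually provides.

static pte_t *pte_offset_map_lock_get(struct mm_struct *mm, pmd_t *pmd,
				      unsigned long address, pte_t *ptent,
				      spinlock_t **ptlp)
{
	pmd_t pmdval = READ_ONCE(*pmd);
	spinlock_t *ptl;
	pte_t *ptep;

	/* read the pmd once before using it to find the lock */
	if (pmd_none(pmdval) || unlikely(pmd_bad(pmdval)))
		return NULL;

	ptl = pte_lockptr(mm, pmd);
	spin_lock(ptl);
	/* recheck under the lock that the zap path serializes on */
	if (unlikely(pmd_none(*pmd)) || !pte_get_unless_zero(pmd)) {
		spin_unlock(ptl);
		return NULL;
	}

	ptep = pte_offset_map(pmd, address);
	*ptent = *ptep;		/* copy the pte_t while the ptl is held */
	*ptlp = ptl;
	return ptep;
}

follow_page_pte() could then work from the returned snapshot instead
of re-reading the pte locklessly, and drop the reference with the
series' pte_put() after pte_unmap_unlock(). Note this still assumes
the ptl, which lives in the pte page with split ptlocks, cannot itself
be freed between pte_lockptr() and spin_lock(), and that is really the
same question as the GUP-fast one.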
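On the GUP-fast side, the invariant that has to be preserved is the
one in lockless_pages_from_mm(), condensed here from mm/gup.c (not
verbatim):

/*
 * Freeing a page table page is currently safe against this walker
 * because the free is driven by a TLB-shootdown IPI, which cannot be
 * serviced while IRQs are off here, or is deferred by RCU with
 * CONFIG_MMU_GATHER_RCU_TABLE_FREE. A refcount-driven free of the pte
 * page needs an equivalent guarantee, otherwise gup_pgd_range() can
 * walk into a freed pte page.
 */
static unsigned long lockless_pages_from_mm(unsigned long start,
					    unsigned long end,
					    unsigned int gup_flags,
					    struct page **pages)
{
	unsigned long flags;
	int nr_pinned = 0;

	local_irq_save(flags);	/* holds off the IPI, pinning the tables */
	gup_pgd_range(start, end, gup_flags, pages, &nr_pinned);
	local_irq_restore(flags);

	return nr_pinned;
}

Jason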