On 01.09.21 15:53, Jason Gunthorpe wrote:
On Thu, Aug 19, 2021 at 11:18:55AM +0800, Qi Zheng wrote:
diff --git a/mm/gup.c b/mm/gup.c
index 2630ed1bb4f4..30757f3b176c 100644
+++ b/mm/gup.c
@@ -500,6 +500,9 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
if (unlikely(pmd_bad(*pmd)))
return no_page_table(vma, flags);
+ if (!pte_try_get(mm, pmd))
+ return no_page_table(vma, flags);
+
ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
This is not good on a performance path, the pte_try_get() is
locking/locking the same lock that pte_offset_map_lock() is getting.
Yes, and we really need patch #8, anything else is just confusing reviewers.
This would be much better if the map_lock infra could manage the
refcount itself.
I'm also not really keen on adding ptl level locking to all the
currently no-lock paths. If we are doing that then the no-lock paths
should rely on the ptl for alot more of their operations and avoid the
complicatred no-lock data access we have. eg 'pte_try_get()' should
also copy the pte_t under the lock.
Also, I don't really understand how this scheme works with
get_user_pages_fast.
With the RCU change it in #8 it should work just fine, because RCU
synchronize has to wait either until all other CPUs have left the RCU
read section, or re-enabled interrupts.
--
Thanks,
David / dhildenb