On 7/27/23 14:28, David Hildenbrand wrote:
We accidentally enforced PROT_NONE PTE/PMD permission checks for
follow_page() like we do for get_user_pages() and friends. That was
undesired, because follow_page() is usually only used to look up a currently
mapped page, not to actually access it. Further, follow_page() does not
actually trigger fault handling, but instead simply fails.
I see that follow_page() is also completely undocumented, which
reduces us to deducing how it should be used. Patches that change
follow_page()'s behavior should probably have a go at documenting
it, too.
Let's restore that behavior by conditionally setting FOLL_FORCE if
FOLL_WRITE is not set. This way, for example, KSM and migration code will
no longer fail on PROT_NONE-mapped PTEs/PMDs.
Handling this internally doesn't require us to add any new FOLL_FORCE
usage outside of GUP code.
While at it, refuse to accept FOLL_FORCE: we don't even perform VMA
permission checks like in check_vma_flags(), so especially
FOLL_FORCE|FOLL_WRITE would be dodgy.
This issue was identified by code inspection. We'll add some
documentation regarding FOLL_FORCE next.
Reported-by: Peter Xu <peterx@xxxxxxxxxx>
Fixes: 474098edac26 ("mm/gup: replace FOLL_NUMA by gup_can_follow_protnone()")
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
mm/gup.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/mm/gup.c b/mm/gup.c
index 2493ffa10f4b..da9a5cc096ac 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -841,9 +841,17 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 	if (vma_is_secretmem(vma))
 		return NULL;
-	if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
+	if (WARN_ON_ONCE(foll_flags & (FOLL_PIN | FOLL_FORCE)))
 		return NULL;
This is not a super happy situation: callers of follow_page() are now
prohibited (see above: we should document that interface) from passing
in FOLL_FORCE...
+	/*
+	 * Traditionally, follow_page() succeeded on PROT_NONE-mapped pages
+	 * but failed follow_page(FOLL_WRITE) on R/O-mapped pages. Let's
+	 * keep these semantics by setting FOLL_FORCE if FOLL_WRITE is not set.
+	 */
+	if (!(foll_flags & FOLL_WRITE))
+		foll_flags |= FOLL_FORCE;
+
...but then we set it anyway, for special cases. It's awkward because
FOLL_FORCE is not an "internal to gup" flag (yet?).
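To make the asymmetry concrete, here is a rough user-space sketch of how I
read the combined effect of the two hunks. It only models the flag handling
at the top of follow_page(), nothing else. The SK_* names, their values, and
the helper sk_follow_page_flags() are made up for illustration and are not
the kernel's definitions.

```c
#include <stdbool.h>

/*
 * Hypothetical flag bits for this sketch only. The real FOLL_* values
 * are defined in the kernel headers and differ from these.
 */
#define SK_FOLL_WRITE 0x01u
#define SK_FOLL_FORCE 0x02u
#define SK_FOLL_PIN   0x04u

/*
 * Model of the flag handling in follow_page() after the patch.
 *
 * Returns false when the caller's flags are refused (the real code
 * warns once and returns NULL). Otherwise stores in *effective the
 * flags that the page table walk would see and returns true.
 */
static bool sk_follow_page_flags(unsigned int caller_flags,
				 unsigned int *effective)
{
	/* Callers may pass neither FOLL_PIN nor FOLL_FORCE. */
	if (caller_flags & (SK_FOLL_PIN | SK_FOLL_FORCE))
		return false;

	/* ...but a lookup without FOLL_WRITE gets FOLL_FORCE set internally. */
	if (!(caller_flags & SK_FOLL_WRITE))
		caller_flags |= SK_FOLL_FORCE;

	*effective = caller_flags;
	return true;
}
```

So in this model a caller that passes no flags ends up walking with
FOLL_FORCE set, while a caller that passes FOLL_FORCE itself is refused.
That is the part I find awkward.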
I don't yet have suggestions, other than:
1) Yes, the FOLL_NUMA made things bad.
2) And they are still very confusing, especially the new use of
FOLL_FORCE.
...I'll try to let this soak in and maybe recommend something
in a more productive way. :)
thanks,
--
John Hubbard
NVIDIA