On 3/29/22 18:04, David Hildenbrand wrote:
> Let's verify when (un)pinning anonymous pages that we always deal with
> exclusive anonymous pages, which guarantees that we'll have a reliable
> PIN, meaning that we cannot end up with the GUP pin being inconsistent
> with the pages mapped into the page tables due to a COW triggered
> by a write fault.
>
> When pinning pages, after conditionally triggering GUP unsharing of
> possibly shared anonymous pages, we should always only see exclusive
> anonymous pages. Note that anonymous pages that are mapped writable
> must be marked exclusive, otherwise we'd have a BUG.
>
> When pinning during ordinary GUP, simply add a check after our
> conditional GUP-triggered unsharing checks. As we know exactly how the
> page is mapped, we know exactly in which page we have to check for
> PageAnonExclusive().
>
> When pinning via GUP-fast we have to be careful, because we can race with
> fork(): verify only after we made sure via the seqcount that we didn't
> race with concurrent fork() that we didn't end up pinning a possibly
> shared anonymous page.
>
> Similarly, when unpinning, verify that the pages are still marked as
> exclusive: otherwise something turned the pages possibly shared, which
> can result in random memory corruptions, which we really want to catch.
>
> With only the pinned pages at hand and not the actual page table entries
> we have to be a bit careful: hugetlb pages are always mapped via a
> single logical page table entry referencing the head page, and
> PG_anon_exclusive of the head page applies. Anon THP are a bit more
> complicated, because we might have obtained the page reference either via
> a PMD or a PTE -- depending on the mapping type, PageAnonExclusive of
> either the head page (PMD-mapped THP) or the tail page (PTE-mapped THP)
> applies: as we don't know and to make our life easier, check that either
> is set.
>
> Take care to not verify in case we're unpinning during GUP-fast because
> we detected concurrent fork(): we might stumble over an anonymous page
> that is now shared.
>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

Nits:

> @@ -510,6 +563,10 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  		page = ERR_PTR(-EMLINK);
>  		goto out;
>  	}
> +
> +	VM_BUG_ON((flags & FOLL_PIN) && PageAnon(page) &&
> +		  !PageAnonExclusive(page));

Do we rather want VM_BUG_ON_PAGE? Also for the same tests in mm/huge*.c below.
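I.e., something like this (VM_BUG_ON_PAGE() additionally dumps the
offending page on failure, which makes such assertions much easier to
debug):

	VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) &&
		       !PageAnonExclusive(page), page);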
> +
>  	/* try_grab_page() does nothing unless FOLL_GET or FOLL_PIN is set. */
>  	if (unlikely(!try_grab_page(page, flags))) {
>  		page = ERR_PTR(-ENOMEM);
> @@ -2744,8 +2801,10 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
>  	 */
>  	if (gup_flags & FOLL_PIN) {
>  		if (read_seqcount_retry(&current->mm->write_protect_seq, seq)) {
> -			unpin_user_pages(pages, nr_pinned);
> +			unpin_user_pages_lockless(pages, nr_pinned);
>  			return 0;
> +		} else {
> +			sanity_check_pinned_pages(pages, nr_pinned);
>  		}
>  	}
>  	return nr_pinned;
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2dc820e8c873..b32774f289d6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1392,6 +1392,9 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
>  	if (!pmd_write(*pmd) && gup_must_unshare(flags, page))
>  		return ERR_PTR(-EMLINK);
>
> +	VM_BUG_ON((flags & FOLL_PIN) && PageAnon(page) &&
> +		  !PageAnonExclusive(page));
> +
>  	if (!try_grab_page(page, flags))
>  		return ERR_PTR(-ENOMEM);
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 21f2ec446117..48740e6c3476 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6097,6 +6097,9 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  		pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT;
>  		page = pte_page(huge_ptep_get(pte));
>
> +		VM_BUG_ON((flags & FOLL_PIN) && PageAnon(page) &&
> +			  !PageAnonExclusive(page));
> +
>  		/*
>  		 * If subpage information not requested, update counters
>  		 * and skip the same_page loop below.
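As an aside, the unpin-side rule from the commit message (hugetlb:
PG_anon_exclusive of the head page applies; anon THP: either the head or
the tail page may carry it) would translate to roughly the sketch below.
Untested and only my reading of the description: the helper name
sanity_check_pinned_pages() is taken from the lockless_pages_from_mm()
hunk above, the body is guessed, not quoted from the patch:

	static void sanity_check_pinned_pages(struct page **pages,
					      unsigned long npages)
	{
		if (!IS_ENABLED(CONFIG_DEBUG_VM))
			return;

		for (; npages; npages--, pages++) {
			struct page *page = *pages;
			struct page *head = compound_head(page);

			/* Only pinned anonymous pages are checked. */
			if (!PageAnon(head))
				continue;
			if (!PageHead(head) || PageHuge(head)) {
				/* Order-0 page or hugetlb: the head page decides. */
				VM_BUG_ON_PAGE(!PageAnonExclusive(head), page);
			} else {
				/* PMD- or PTE-mapped anon THP: either may be set. */
				VM_BUG_ON_PAGE(!PageAnonExclusive(head) &&
					       !PageAnonExclusive(page), page);
			}
		}
	}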