On Sun, 2012-10-21 at 05:56 -0700, tip-bot for Andrea Arcangeli wrote: > In get_user_pages_fast() the TLB shootdown code can clear the pagetables > before firing any TLB flush (the page can't be freed until the TLB > flushing IPI has been delivered but the pagetables will be cleared well > before sending any TLB flushing IPI). I think we want to do this for all gup_fast() implementations. When I reported this issue I also proposed adding something like page_table_deref() which we could use through-out. Not sure we want to, but at least all archs need an audit for this. > --- > arch/x86/mm/gup.c | 23 ++++++++++++++++++++--- > mm/memory.c | 2 +- > 2 files changed, 21 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c > index dd74e46..6dc9921 100644 > --- a/arch/x86/mm/gup.c > +++ b/arch/x86/mm/gup.c > @@ -150,7 +150,13 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end, > > pmdp = pmd_offset(&pud, addr); > do { > - pmd_t pmd = *pmdp; > + /* > + * With THP and hugetlbfs the pmd can change from > + * under us and it can be cleared as well by the TLB > + * shootdown, so read it with ACCESS_ONCE to do all > + * computations on the same sampling. > + */ > + pmd_t pmd = ACCESS_ONCE(*pmdp); > > next = pmd_addr_end(addr, end); > /* > @@ -220,7 +226,13 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end, > > pudp = pud_offset(&pgd, addr); > do { > - pud_t pud = *pudp; > + /* > + * With hugetlbfs giga pages the pud can change from > + * under us and it can be cleared as well by the TLB > + * shootdown, so read it with ACCESS_ONCE to do all > + * computations on the same sampling. > + */ > + pud_t pud = ACCESS_ONCE(*pudp); > > next = pud_addr_end(addr, end); > if (pud_none(pud)) > @@ -280,7 +292,12 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, > local_irq_save(flags); > pgdp = pgd_offset(mm, addr); > do { > - pgd_t pgd = *pgdp; > + /* > + * The pgd could be cleared by the TLB shootdown from > + * under us so read it with ACCESS_ONCE to do all > + * computations on the same sampling. > + */ > + pgd_t pgd = ACCESS_ONCE(*pgdp); > > next = pgd_addr_end(addr, end); > if (pgd_none(pgd)) > diff --git a/mm/memory.c b/mm/memory.c > index cc8e280..c0de477 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3555,7 +3555,7 @@ int handle_pte_fault(struct mm_struct *mm, > pte_t entry; > spinlock_t *ptl; > > - entry = *pte; > + entry = ACCESS_ONCE(*pte); > if (!pte_present(entry)) { > if (pte_none(entry)) { > if (vma->vm_ops) { -- To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html