Re: [PATCH v3 3/4] mm: don't expose non-hugetlb page to fast gup prematurely

Yu Zhao <yuzhao@xxxxxxxxxx> · Fri, 27 Sep 2019 12:31:16 -0600

On Fri, Sep 27, 2019 at 02:33:00PM +0200, Michal Hocko wrote:
> On Thu 26-09-19 20:26:46, John Hubbard wrote:
> > On 9/26/19 3:20 AM, Kirill A. Shutemov wrote:
> > > BTW, have you looked at other levels of the page table hierarchy? Do we
> > > have the same issue for PMD/PUD/... pages?
> > > 
> > 
> > Along the lines of "what other memory barriers might be missing for
> > get_user_pages_fast()", I'm also concerned that the synchronization between
> > get_user_pages_fast() and freeing the page tables might be technically broken,
> > due to missing memory barriers on the get_user_pages_fast() side. Details:
> > 
> > gup_fast() disables interrupts, but I think it also needs some sort of
> > memory barrier(s), in order to prevent reads of the page table (gup_pgd_range,
> > etc) from speculatively happening before the interrupts are disabled. 
> 
> Could you be more specific about the race scenario please? I thought
> that the unmap path will be serialized by the pte lock.

Yes, the unmap path is protected by ptl, but the fast gup isn't.
Please correct me if I'm wrong, John. This is the hypothetical race:

CPU 1 (gup)				CPU 2 (zap)
speculatively load a pmd val
					zap the pte table pointed to by the pmd val
					flush tlb by ipi
<handle ipi>
					free the pte table
local_irq_disable()
use the stale pmd val
use-after-free the pte table
local_irq_enable()

I don't think this can actually happen, because taking the IPI on CPU 1
would act as a full mb and force a reload of the pmd val once the
interrupt returns. But I'm not entirely sure.
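
To make the interleaving above concrete, here is a toy userspace model of
it. This is only a sketch and not kernel code: toy_pte_table,
toy_gup_hits_freed_table() and the reload_after_ipi flag are made-up names,
and the model assumes the zap clears the pmd entry before the pte table is
freed. The flag simply encodes the open question, i.e. whether handling the
IPI forces CPU 1 to reload the pmd val.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pte table page; not a kernel type. */
struct toy_pte_table {
	bool freed;
};

/*
 * Replay the race diagram step by step in a single thread.
 * Returns true if "CPU 1" ends up using a freed pte table.
 */
static bool toy_gup_hits_freed_table(bool reload_after_ipi)
{
	struct toy_pte_table table = { .freed = false };
	struct toy_pte_table *pmd = &table;	/* pmd entry points at the table */
	struct toy_pte_table *cached;
	bool uaf;

	/* CPU 1: speculatively load a pmd val */
	cached = pmd;
	assert(cached == &table);

	/* CPU 2: zap the pte table (assumed to clear the pmd entry) */
	pmd = NULL;

	/* CPU 2: flush tlb by ipi; CPU 1: <handle ipi> */
	if (reload_after_ipi)
		cached = pmd;	/* interrupt acted as a full mb: value reloaded */

	/* CPU 2: free the pte table */
	table.freed = true;

	/* CPU 1: local_irq_disable(), then use whatever pmd val it holds */
	uaf = (cached != NULL) && cached->freed;
	/* CPU 1: local_irq_enable() */

	return uaf;
}
```

With reload_after_ipi set to false the function reports a use-after-free,
and with it set to true CPU 1 sees the cleared pmd and bails out. So the
whole question reduces to whether that reload is actually guaranteed.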