On Wed, Aug 31, 2022 at 10:55:43AM -0700, Yang Shi wrote:
> On Wed, Aug 31, 2022 at 1:30 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >
> > The comment is stale, because a TLB flush is no longer sufficient and
> > required to synchronize against concurrent GUP-fast. This used to be true
> > in the past, whereby a TLB flush would have implied an IPI on architectures
> > that support GUP-fast, resulting in GUP-fast that disables local interrupts
> > from completing before completing the flush.
>
> Hmm... it seems there might be problem for THP collapse IIUC. THP
> collapse clears and flushes pmd before doing anything on pte and
> relies on interrupt disable of fast GUP to serialize against fast GUP.
> But if TLB flush is no longer sufficient, then we may run into the
> below race IIUC:
>
>          CPU A                                CPU B
> THP collapse                             fast GUP
>
>                                          gup_pmd_range() <-- see valid pmd
>
>                                          gup_pte_range() <-- work on pte
> clear pmd and flush TLB
> __collapse_huge_page_isolate()
>     isolate page <-- before GUP bump refcount
>
>                                          pin the page
> __collapse_huge_page_copy()
>     copy data to huge page
>     clear pte (don't flush TLB)
> Install huge pmd for huge page
>
>                                          return the obsolete page

Maybe the pmd level tlb flush is still needed, but on pte level it's optional
(where we can rely on fast-gup rechecking on the pte change)?

-- 
Peter Xu
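
For readers unfamiliar with the "recheck" idea mentioned above, here is a rough user-space sketch of the general pattern: snapshot the entry, take a reference, then re-read the entry and drop the reference if it changed. This is not kernel code. Every name in it (struct model_page, struct model_pte, model_clear_pte(), model_pin_with_recheck(), and the interleave hook that stands in for the other CPU) is hypothetical and exists only for illustration. It also says nothing about whether a pte-level recheck alone closes the race in the diagram, which is the open question here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are NOT kernel types. */
struct model_page {
    atomic_int refcount;
};

struct model_pte {
    _Atomic uintptr_t val;          /* 0 models a cleared entry */
};

/* Models the collapse side clearing the entry (assumption, for the demo). */
static void model_clear_pte(struct model_pte *ptep)
{
    atomic_store(&ptep->val, 0);
}

/*
 * Pin-then-recheck: snapshot the entry, take a reference, then re-read the
 * entry. If it no longer matches the snapshot, drop the reference and fail.
 * 'interleave' is a hypothetical test hook that runs between the pin and
 * the recheck to simulate a concurrent writer; pass NULL for no interference.
 */
static bool model_pin_with_recheck(struct model_pte *ptep,
                                   struct model_page *page,
                                   void (*interleave)(struct model_pte *))
{
    uintptr_t snap = atomic_load(&ptep->val);

    if (snap == 0)
        return false;               /* nothing mapped, nothing to pin */

    atomic_fetch_add(&page->refcount, 1);   /* "pin the page" */

    if (interleave)
        interleave(ptep);           /* simulated concurrent change */

    if (atomic_load(&ptep->val) != snap) {
        /* Entry changed under us: undo the pin and back off. */
        atomic_fetch_sub(&page->refcount, 1);
        return false;
    }
    return true;
}
```

In this model, calling model_pin_with_recheck() with a NULL hook succeeds and leaves the reference held, while passing model_clear_pte as the hook makes the recheck fail and the reference is dropped again. Note the recheck only catches changes to the entry it re-reads.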