On Wed, Aug 31, 2022 at 11:52 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Wed, Aug 31, 2022 at 10:55:43AM -0700, Yang Shi wrote:
> > On Wed, Aug 31, 2022 at 1:30 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> > >
> > > The comment is stale, because a TLB flush is no longer sufficient and
> > > required to synchronize against concurrent GUP-fast. This used to be true
> > > in the past, whereby a TLB flush would have implied an IPI on architectures
> > > that support GUP-fast, resulting in GUP-fast that disables local interrupts
> > > from completing before completing the flush.
> >
> > Hmm... it seems there might be problem for THP collapse IIUC. THP
> > collapse clears and flushes pmd before doing anything on pte and
> > relies on interrupt disable of fast GUP to serialize against fast GUP.
> > But if TLB flush is no longer sufficient, then we may run into the
> > below race IIUC:
> >
> >          CPU A                              CPU B
> > THP collapse                             fast GUP
> >
> >                                          gup_pmd_range() <-- see valid pmd
> >
> >                                          gup_pte_range() <-- work on pte
> > clear pmd and flush TLB
> > __collapse_huge_page_isolate()
> >     isolate page <-- before GUP bump refcount
> >
> >                                          pin the page
> > __collapse_huge_page_copy()
> >     copy data to huge page
> >     clear pte (don't flush TLB)
> > Install huge pmd for huge page
> >
> >                                          return the obsolete page
>
> Maybe the pmd level tlb flush is still needed, but on pte level it's
> optional (where we can rely on fast-gup rechecking on the pte change)?

Do you mean in khugepaged? It does TLB flush, but some arches may not use IPI.

>
> --
> Peter Xu
>