On Wed, Aug 31, 2022 at 01:38:21PM -0700, Yang Shi wrote:
> On Wed, Aug 31, 2022 at 11:52 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > On Wed, Aug 31, 2022 at 10:55:43AM -0700, Yang Shi wrote:
> > > On Wed, Aug 31, 2022 at 1:30 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> > > >
> > > > The comment is stale, because a TLB flush is no longer sufficient and
> > > > required to synchronize against concurrent GUP-fast. This used to be true
> > > > in the past, whereby a TLB flush would have implied an IPI on architectures
> > > > that support GUP-fast, resulting in GUP-fast that disables local interrupts
> > > > from completing before completing the flush.
> > >
> > > Hmm... it seems there might be problem for THP collapse IIUC. THP
> > > collapse clears and flushes pmd before doing anything on pte and
> > > relies on interrupt disable of fast GUP to serialize against fast GUP.
> > > But if TLB flush is no longer sufficient, then we may run into the
> > > below race IIUC:
> > >
> > >          CPU A                                  CPU B
> > > THP collapse                                  fast GUP
> > >
> > >                                               gup_pmd_range() <-- see valid pmd
> > >
> > >                                               gup_pte_range() <-- work on pte
> > > clear pmd and flush TLB
> > > __collapse_huge_page_isolate()
> > >     isolate page <-- before GUP bump refcount
> > >
> > >                                               pin the page
> > > __collapse_huge_page_copy()
> > >     copy data to huge page
> > >     clear pte (don't flush TLB)
> > > Install huge pmd for huge page
> > >
> > >                                               return the obsolete page
> >
> > Maybe the pmd level tlb flush is still needed, but on pte level it's
> > optional (where we can rely on fast-gup rechecking on the pte change)?
>
> Do you mean in khugepaged?

What I wanted to say before was that the immediate tlb flush (after pgtable
entry cleared) seems to be only needed by pmd level to guarantee safety with
concurrent fast-gup, since fast-gup can detect pte change after pinning, and
that should already guarantee safe concurrent fast-gup to me.

After reading the other emails, afaiu we're on the same page.

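To make the "detect pte change after pinning" point concrete, here is a
minimal userspace sketch of the pin-then-recheck pattern. It is only an
illustration under stated assumptions: struct fake_page, struct fake_pte,
fast_gup_pin(), clear_pte() and the race hook are hypothetical stand-ins and
not kernel code, and it models the pte-level recheck only, not the pmd-level
race shown in the diagram above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct fake_page {
    atomic_int refcount;
};

struct fake_pte {
    _Atomic(struct fake_page *) page;   /* NULL models a cleared pte */
};

/* Hook run between the first pte read and the pin, to model a racing CPU. */
typedef void (*race_hook)(struct fake_pte *pte);

/*
 * Pin-then-recheck: read the pte, take a reference, re-read the pte,
 * and drop the reference again if the pte changed in between.
 */
static struct fake_page *fast_gup_pin(struct fake_pte *pte, race_hook hook)
{
    struct fake_page *page = atomic_load(&pte->page);

    if (page == NULL)
        return NULL;

    if (hook)
        hook(pte);                      /* the other CPU may clear the pte here */

    atomic_fetch_add(&page->refcount, 1);

    if (atomic_load(&pte->page) != page) {
        atomic_fetch_sub(&page->refcount, 1);   /* pte changed: back off */
        return NULL;
    }
    return page;
}

/* Models the collapse side clearing the pte without any TLB flush. */
static void clear_pte(struct fake_pte *pte)
{
    atomic_store(&pte->page, (struct fake_page *)NULL);
}

/* Small self-check covering the uncontended and the racing case. */
static void sketch_selfcheck(void)
{
    struct fake_page pg;
    struct fake_pte pte;

    atomic_init(&pg.refcount, 1);
    atomic_init(&pte.page, &pg);

    assert(fast_gup_pin(&pte, NULL) == &pg);        /* no race: page is pinned */
    assert(atomic_load(&pg.refcount) == 2);

    assert(fast_gup_pin(&pte, clear_pte) == NULL);  /* race: pin is undone */
    assert(atomic_load(&pg.refcount) == 2);         /* transient ref was dropped */
}
```

Calling sketch_selfcheck() from a main() exercises both paths. The recheck
helps only once the pte itself has changed; in the diagram the pte is cleared
late, after the copy, which is why the pmd level needs separate handling.
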
> It does TLB flush, but some arches may not use IPI.

Yeah, I see that ppc book3s code has customized pmdp_collapse_flush() to
explicitly do the IPIs besides tlb flush using smp calls. I assume
pmdp_collapse_flush() should always be properly implemented to guarantee
safety against fast-gup, or I also agree it could be a bug.

-- 
Peter Xu