On Sat, Jan 4, 2025 at 4:09 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> On Fri, 2025-01-03 at 23:11 +0100, Jann Horn wrote:
> > On Fri, Jan 3, 2025 at 10:55 PM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> > > On Fri, 2025-01-03 at 19:39 +0100, Jann Horn wrote:
> > > > 02fc2aa06e9e0ecdba3fe948cafe5892b72e86c0..3da645139748538daac70166618d8ad95116eb74 100644
> > > > --- a/arch/x86/include/asm/tlbflush.h
> > > > +++ b/arch/x86/include/asm/tlbflush.h
> > > > @@ -242,7 +242,7 @@ void flush_tlb_multi(const struct cpumask *cpumask,
> > > >  	flush_tlb_mm_range((vma)->vm_mm, start, end,		\
> > > >  			   ((vma)->vm_flags & VM_HUGETLB)	\
> > > >  				? huge_page_shift(hstate_vma(vma))	\
> > > > -				: PAGE_SHIFT, false)
> > > > +				: PAGE_SHIFT, true)
> > > >
> > >
> > > The code looks good, but should this macro get
> > > a comment indicating that code that only frees
> > > pages, but not page tables, should be calling
> > > flush_tlb() instead?
> >
> > Documentation/core-api/cachetlb.rst seems to be the common place
> > that's supposed to document the rules - the macro I'm touching is just
> > the x86 implementation. (The arm64 implementation also has some fairly
> > extensive comments that say flush_tlb_range() "also invalidates any
> > walk-cache entries associated with translations for the specified
> > address range" while flush_tlb_page() "only invalidates a single,
> > last-level page-table entry and therefore does not affect any
> > walk-caches".) I wouldn't want to add yet more documentation for this
> > API inside the X86 code. I guess it would make sense to add pointers
> > from the x86 code to the documentation (and copy the details about
> > last-level TLBs from the arm64 code into the docs).
> >
> > I don't see a function flush_tlb() outside of some (non-x86) arch
> > code.
>
> I see zap_pte_range() calling tlb_flush_mmu(),
> which calls tlb_flush_mmu_tlbonly() in include/asm-generic/tlb.h,
> which in turn calls tlb_flush().
>
> The asm-generic version of tlb_flush() goes through
> flush_tlb_mm(), which on x86 would call flush_tlb_mm_range
> with flush_tables = true.
>
> Luckily x86 seems to have its own implementation of
> tlb_flush(), which avoids that issue.

Aah, right. Yeah, I think the tlb_flush() infrastructure with
"struct mmu_gather" is probably one of the two really optimized TLB
flushing hotpaths (the other one being the reclaim path). I think
tlb_flush() is for somewhat different use cases though - my
understanding is that it is mainly for operations that need batching
and/or want to delay TLB flushes while dropping page table locks.

> > I don't know if it makes sense to tell developers to not use
> > flush_tlb_range() for freeing pages. If the performance of
> > flush_tlb_range() actually is an issue, I guess one fix would be to
> > refactor this and add a parameter or something?
> >
> I don't know whether this is a real issue on
> architectures other than x86.

arm64 seems to have code specifically for doing flushes without
affecting cached higher-level entries - __flush_tlb_range_nosync()
receives a "last_level" parameter (which is plumbed through from the
arm64 version of tlb_flush()) and picks "vale1is" or "vae1is"
depending on it.

> For now it looks like the code does the right
> thing when only pages are being freed, so we
> may not need that parameter.
>
> --
> All Rights Reversed.
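
(For completeness, since it came up above: the x86-specific
tlb_flush() I mean lives in arch/x86/include/asm/tlb.h and looks
roughly like the sketch below - written from memory rather than from
a checked-out tree, so treat the exact details as approximate. The
point that matters here is that it passes tlb->freed_tables through
to flush_tlb_mm_range() instead of hardcoding it:

  /* Rough sketch of the x86 tlb_flush(), not verbatim kernel code. */
  static inline void tlb_flush(struct mmu_gather *tlb)
  {
          unsigned long start = 0UL, end = TLB_FLUSH_ALL;
          unsigned int stride_shift = tlb_get_unmap_shift(tlb);

          if (!tlb->fullmm && !tlb->need_flush_all) {
                  start = tlb->start;
                  end = tlb->end;
          }

          /*
           * The "did we free page tables" decision comes from the
           * mmu_gather state, so a zap that only frees pages does not
           * force a walk-cache-invalidating flush.
           */
          flush_tlb_mm_range(tlb->mm, start, end, stride_shift,
                             tlb->freed_tables);
  }

If I remember right, arm64's tlb_flush() carries the same information
via something like "bool last_level = !tlb->freed_tables" before
handing it to __flush_tlb_range().)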