On Mon, Dec 30, 2024 at 6:53 PM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> Use the INVLPGB_FINAL_ONLY flag when invalidating mappings with INVPLGB.
> This way only leaf mappings get removed from the TLB, leaving intermediate
> translations cached.
>
> On the (rare) occasions where we free page tables we do a full flush,
> ensuring intermediate translations get flushed from the TLB.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
> ---
>  arch/x86/include/asm/invlpgb.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/invlpgb.h b/arch/x86/include/asm/invlpgb.h
> index 862775897a54..2669ebfffe81 100644
> --- a/arch/x86/include/asm/invlpgb.h
> +++ b/arch/x86/include/asm/invlpgb.h
> @@ -51,7 +51,7 @@ static inline void invlpgb_flush_user(unsigned long pcid,
>  static inline void invlpgb_flush_user_nr(unsigned long pcid, unsigned long addr,
>                                          int nr, bool pmd_stride)
>  {
> -       __invlpgb(0, pcid, addr, nr - 1, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
> +       __invlpgb(0, pcid, addr, nr - 1, pmd_stride, INVLPGB_PCID | INVLPGB_VA | INVLPGB_FINAL_ONLY);
>  }

Please note this final-only behavior in a comment above the function
and/or rename the function to make this clear.

I think this currently interacts badly with pmdp_collapse_flush(),
which is used by retract_page_tables(). pmdp_collapse_flush() removes
a PMD entry pointing to a page table with pmdp_huge_get_and_clear(),
then calls flush_tlb_range(), which on x86 calls flush_tlb_mm_range()
with the "freed_tables" parameter set to false. But that's really a
preexisting bug, not something introduced by your series. I've sent a
patch for that, see
<https://lore.kernel.org/r/20250103-x86-collapse-flush-fix-v1-1-3c521856cfa6@xxxxxxxxxx>.
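
To make the concern concrete, here is a toy userspace model of the
interaction, not kernel code. All toy_* names are hypothetical. It
assumes, per the patch description, that a ranged flush with
freed_tables == false ends up as a final-only invalidation, and that
freed_tables == true drops intermediate translations as well:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel code): for one address range, track
 * whether the TLB still holds a leaf translation and whether it still holds
 * a cached intermediate entry (a PMD entry pointing at a page table).
 */
struct toy_tlb {
	bool leaf_cached;
	bool intermediate_cached;
};

/* Stand-in for a ranged invalidation; final_only mimics INVLPGB_FINAL_ONLY. */
static void toy_flush_range(struct toy_tlb *tlb, bool final_only)
{
	assert(tlb != NULL);
	tlb->leaf_cached = false;
	if (!final_only)
		tlb->intermediate_cached = false;
}

/*
 * Stand-in for flush_tlb_mm_range(). Assumption: only freed_tables == true
 * results in a flush that also removes intermediate translations.
 */
static void toy_flush_mm_range(struct toy_tlb *tlb, bool freed_tables)
{
	toy_flush_range(tlb, !freed_tables);
}
```

In this model, calling toy_flush_mm_range(&tlb, false) on a TLB with both
entries cached clears leaf_cached but leaves intermediate_cached set. That
is the situation pmdp_collapse_flush() would be in after clearing the PMD
entry: the TLB can still walk through the removed page table. Passing
true clears both.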