On Fri, 2025-01-03 at 18:49 +0100, Jann Horn wrote: > On Mon, Dec 30, 2024 at 6:53 PM Rik van Riel <riel@xxxxxxxxxxx> > > only those upper-level entries that lead to the target PTE in > > the page table hierarchy, leaving unrelated upper-level entries > > intact. > > How does this patch interact with KVM SVM guests? > In particular, will this patch cause TLB flushes performed by guest > kernels to behave differently? > That is a good question. A Linux guest should be fine, since Linux already flushes the parts of the TLB where page tables are being freed. I don't know whether this could potentially break some non-Linux guests, though. > [...] > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c > > index 454a370494d3..585d0731ca9f 100644 > > --- a/arch/x86/mm/tlb.c > > +++ b/arch/x86/mm/tlb.c > > @@ -477,7 +477,7 @@ static void broadcast_tlb_flush(struct > > flush_tlb_info *info) > > if (info->stride_shift > PMD_SHIFT) > > maxnr = 1; > > > > - if (info->end == TLB_FLUSH_ALL) { > > + if (info->end == TLB_FLUSH_ALL || info->freed_tables) { > > invlpgb_flush_single_pcid(kern_pcid(asid)); > > /* Do any CPUs supporting INVLPGB need PTI? */ > > if (static_cpu_has(X86_FEATURE_PTI)) > > @@ -1110,7 +1110,7 @@ static void flush_tlb_func(void *info) > > * > > * The only question is whether to do a full or partial > > flush. > > * > > - * We do a partial flush if requested and two extra > > conditions > > + * We do a partial flush if requested and three extra > > conditions > > * are met: > > * > > * 1. f->new_tlb_gen == local_tlb_gen + 1. We have an > > invariant that > > @@ -1137,10 +1137,14 @@ static void flush_tlb_func(void *info) > > * date. By doing a full flush instead, we can increase > > * local_tlb_gen all the way to mm_tlb_gen and we can > > probably > > * avoid another flush in the very near future. > > + * > > + * 3. No page tables were freed. If page tables were freed, > > a full > > + * flush ensures intermediate translations in the TLB > > get flushed. > > */ > > Why is this necessary - do we ever issue TLB flushes that are > intended > to zap upper-level entries which are not covered by the specified > address range? > > When, for example, free_pmd_range() gets rid of a page table, it > calls > pmd_free_tlb(), which sets tlb->freed_tables and does > tlb_flush_pud_range(tlb, address, PAGE_SIZE). > I missed those calls. It looks like this change is not needed. Of course, the way pmd_free_tlb() operates, the partial zaps done in that code will exceed the (default 33 pages) value of tlb_single_page_flush_ceiling, and the code in flush_tlb_mm_range() will already do a full flush by default today. I'll leave out these unnecessary changes in the next version. -- All Rights Reversed.