The patch titled Subject: mm/vmalloc: prevent stale TLBs in fully utilized blocks has been added to the -mm mm-unstable branch. Its filename is mm-vmalloc-prevent-stale-tlbs-in-fully-utilized-blocks.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-vmalloc-prevent-stale-tlbs-in-fully-utilized-blocks.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Subject: mm/vmalloc: prevent stale TLBs in fully utilized blocks Date: Thu, 25 May 2023 14:57:03 +0200 (CEST) Patch series "mm/vmalloc: Assorted fixes and improvements", v2. this series addresses the following issues: 1) Prevent the stale TLB problem related to fully utilized vmap blocks 2) Avoid the double per CPU list walk in _vm_unmap_aliases() 3) Avoid flushing dirty space over and over 4) Add a lockless quickcheck in vb_alloc() and add missing READ/WRITE_ONCE() annotations 5) Prevent overeager purging of usable vmap_blocks if not under memory/address space pressure. This patch (of 6): _vm_unmap_aliases() is used to ensure that no unflushed TLB entries for a page are left in the system. This is required due to the lazy TLB flush mechanism in vmalloc. This is tried to achieve by walking the per CPU free lists, but those do not contain fully utilized vmap blocks because they are removed from the free list once the blocks free space became zero. When the block is not fully unmapped then it is not on the purge list either. So neither the per CPU list iteration nor the purge list walk find the block and if the page was mapped via such a block and the TLB has not yet been flushed, the guarantee of _vm_unmap_aliases() that there are no stale TLBs after returning is broken: x = vb_alloc() // Removes vmap_block from free list because vb->free became 0 vb_free(x) // Unmaps page and marks in dirty_min/max range // Block has still mappings and is not put on purge list // Page is reused vm_unmap_aliases() // Can't find vmap block with the dirty space -> FAIL So instead of walking the per CPU free lists, walk the per CPU xarrays which hold pointers to _all_ active blocks in the system including those removed from the free lists. Link: https://lkml.kernel.org/r/20230525122342.109672430@xxxxxxxxxxxxx Link: https://lkml.kernel.org/r/20230525124504.573987880@xxxxxxxxxxxxx Fixes: db64fe02258f ("mm: rewrite vmap layer") Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Reviewed-by: Christoph Hellwig <hch@xxxxxx> Reviewed-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx> Reviewed-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx> Cc: Baoquan He <bhe@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/vmalloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/mm/vmalloc.c~mm-vmalloc-prevent-stale-tlbs-in-fully-utilized-blocks +++ a/mm/vmalloc.c @@ -2236,9 +2236,10 @@ static void _vm_unmap_aliases(unsigned l for_each_possible_cpu(cpu) { struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, cpu); struct vmap_block *vb; + unsigned long idx; rcu_read_lock(); - list_for_each_entry_rcu(vb, &vbq->free, free_list) { + xa_for_each(&vbq->vmap_blocks, idx, vb) { spin_lock(&vb->lock); if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) { unsigned long va_start = vb->va->va_start; _ Patches currently in -mm which might be from tglx@xxxxxxxxxxxxx are mm-vmalloc-prevent-stale-tlbs-in-fully-utilized-blocks.patch mm-vmalloc-avoid-iterating-over-per-cpu-vmap-blocks-twice.patch mm-vmalloc-prevent-flushing-dirty-space-over-and-over.patch mm-vmalloc-check-free-space-in-vmap_block-lockless.patch mm-vmalloc-add-missing-read-write_once-annotations.patch mm-vmalloc-dont-purge-usable-blocks-unnecessarily.patch