Hi! I haven't actually managed to reproduce this behavior, so maybe I'm just misunderstanding how this works; but I think the arch_tlbbatch_flush() path for batched TLB flushing in vmscan ought to have some kind of integration with mm_tlb_flush_nested().

I think that currently, the following race could happen:

[initial situation: page P is mapped into a page table of task B, but the page is not referenced, and the PTE's A/D bits are clear]

A: vmscan begins
A: vmscan looks at P and P's PTEs, and concludes that P is not currently in use
B: reads from P through the PTE, setting the Accessed bit and creating a TLB entry
A: vmscan enters try_to_unmap_one()
A: try_to_unmap_one() calls should_defer_flush(), which returns true
A: try_to_unmap_one() removes the PTE and queues a TLB flush (arch_tlbbatch_add_mm())
A: try_to_unmap_one() returns, and try_to_unmap() returns to shrink_folio_list()
B: calls munmap() on the VMA that mapped P
B: no PTEs are removed, so no TLB flush happens
B: munmap() returns
[at this point, the stale TLB entry still exists]
B: calls mmap(), which reuses the same area that was just unmapped
B: tries to access the newly created VMA, but the access instead goes through the stale TLB entry
A: shrink_folio_list() calls try_to_unmap_flush(), which finally removes the stale TLB entry

The effect would be that after process B removes a mapping with munmap() and creates a new mapping in its place, it would still see data from the old mapping when trying to access the new mapping.

Am I missing something that protects against this scenario?

munmap() uses the mmu_gather infrastructure, which tries to protect against this kind of correctness bug caused by multiple racing TLB invalidations: in tlb_finish_mmu(), it blows away the whole TLB whenever one TLB invalidation ends while another is still in progress (detected via mm_tlb_flush_nested(tlb->mm)).
But mmu_gather doesn't seem to be aware of TLB flushes that are batched up via the arch_tlbbatch_flush() infrastructure, so that protection doesn't help here. I think it might be necessary to add a new global counter of pending arch_tlbbatch_flush() flushes and query it in mm_tlb_flush_nested(), or something like that.