It turns out that the Linux TLB batching mechanism suffers from various
races. Races caused by batching during reclamation were recently handled
by Mel, and this patch-set deals with the others. The more fundamental
issue is that concurrent updates of the page-tables allow TLB flushes to
be batched on one core while another core changes the page-tables. This
other core may assume, based on the updated PTE value, that a PTE change
does not require a flush, while it is unaware that TLB flushes are still
pending.

This behavior affects KSM (which may result in memory corruption) and
MADV_FREE and MADV_DONTNEED (which may result in incorrect behavior). A
proof-of-concept can easily produce the wrong behavior of MADV_DONTNEED.
Memory corruption in KSM is harder to produce in practice, but was
observed by hacking the kernel and adding a delay before flushing and
replacing the KSM page.

Finally, there is also one memory barrier missing, which may affect
architectures with a weak memory model.

v5 -> v6:
* Combining with Minchan Kim's patch set, adding acks (Andrew)
* Minor: missing header, typos (Nadav)
* Renaming arch_generic_tlb_finish_mmu (Mel)

Minchan's v1 -> v2 (combined):
* Separate the core part of the TLB batching API from the arch-specific
  one (Mel)
* Introduce mm_tlb_flush_nested (Mel)

v4 -> v5:
* Fixing embarrassing build mistake (0day)

v3 -> v4:
* Change function names to indicate they inc/dec and not set/clear
  (Sergey)
* Avoid additional barriers, and instead revert the patch that accessed
  mm_tlb_flush_pending without a lock (Mel)

v2 -> v3:
* Do not init tlb_flush_pending if it is not defined without (Sergey)
* Internalize memory barriers to mm_tlb_flush_pending (Minchan)

v1 -> v2:
* Explain the implications of the race (Andrew)
* Mark the patch that addresses the race as stable (Andrew)
* Add another patch to clean the use of barriers (Andrew)

Minchan Kim (4):
  mm: refactoring TLB gathering API
  mm: make tlb_flush_pending global
  mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
  mm: fix KSM data corruption

Nadav Amit (3):
  mm: migrate: prevent racy access to tlb_flush_pending
  mm: migrate: fix barriers around tlb_flush_pending
  Revert "mm: numa: defer TLB flush for THP migration as long as possible"

 arch/arm/include/asm/tlb.h  | 11 ++++++--
 arch/ia64/include/asm/tlb.h |  8 ++++--
 arch/s390/include/asm/tlb.h | 17 +++++++-----
 arch/sh/include/asm/tlb.h   |  8 +++---
 arch/um/include/asm/tlb.h   | 13 ++++++---
 fs/proc/task_mmu.c          |  7 +++--
 include/asm-generic/tlb.h   |  7 ++---
 include/linux/mm_types.h    | 64 +++++++++++++++++++++++++++------------------
 kernel/fork.c               |  2 +-
 mm/debug.c                  |  4 +--
 mm/huge_memory.c            |  7 +++++
 mm/ksm.c                    |  3 ++-
 mm/memory.c                 | 41 ++++++++++++++++++++++++-----
 mm/migrate.c                |  6 -----
 mm/mprotect.c               |  4 +--
 15 files changed, 135 insertions(+), 67 deletions(-)

-- 
2.11.0