[PATCH v6 0/7] fixes of TLB batching races

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It turns out that Linux TLB batching mechanism suffers from various races.
Races that are caused due to batching during reclamation were recently
handled by Mel and this patch-set deals with others. The more fundamental
issue is that concurrent updates of the page-tables allow for TLB flushes
to be batched on one core, while another core changes the page-tables.
This other core may assume a PTE change does not require a flush based on
the updated PTE value, while it is unaware that TLB flushes are still
pending.

This behavior affects KSM (which may result in memory corruption) and
MADV_FREE and MADV_DONTNEED (which may result in incorrect behavior). A
proof-of-concept can easily produce the wrong behavior of MADV_DONTNEED.
Memory corruption in KSM is harder to produce in practice, but was observed
by hacking the kernel and adding a delay before flushing and replacing the
KSM page.

Finally, there is also one memory barrier missing, which may affect
architectures with weak memory model.

v5 -> v6:
* Combining with Minchan Kim's patch set, adding ack's (Andrew)
* Minor: missing header, typos (Nadav)
* Renaming arch_generic_tlb_finish_mmu (Mel)

Michnan's v1 -> v2 (combined):
* TLB batching API separation core part from arch specific one (Mel)
* introduce mm_tlb_flush_nested (Mel)

v4 -> v5:
* Fixing embarrassing build mistake (0day)

v3 -> v4:
* Change function names to indicate they inc/dec and not set/clear
  (Sergey)
* Avoid additional barriers, and instead revert the patch that accessed
  mm_tlb_flush_pending without a lock (Mel)

v2 -> v3:
* Do not init tlb_flush_pending if it is not defined without (Sergey)
* Internalize memory barriers to mm_tlb_flush_pending (Minchan) 

v1 -> v2:
* Explain the implications of the implications of the race (Andrew)
* Mark the patch that address the race as stable (Andrew)
* Add another patch to clean the use of barriers (Andrew)

Minchan Kim (4):
  mm: refactoring TLB gathering API
  mm: make tlb_flush_pending global
  mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
  mm: fix KSM data corruption

Nadav Amit (3):
  mm: migrate: prevent racy access to tlb_flush_pending
  mm: migrate: fix barriers around tlb_flush_pending
  Revert "mm: numa: defer TLB flush for THP migration as long as
    possible"

 arch/arm/include/asm/tlb.h  | 11 ++++++--
 arch/ia64/include/asm/tlb.h |  8 ++++--
 arch/s390/include/asm/tlb.h | 17 +++++++-----
 arch/sh/include/asm/tlb.h   |  8 +++---
 arch/um/include/asm/tlb.h   | 13 ++++++---
 fs/proc/task_mmu.c          |  7 +++--
 include/asm-generic/tlb.h   |  7 ++---
 include/linux/mm_types.h    | 64 +++++++++++++++++++++++++++------------------
 kernel/fork.c               |  2 +-
 mm/debug.c                  |  4 +--
 mm/huge_memory.c            |  7 +++++
 mm/ksm.c                    |  3 ++-
 mm/memory.c                 | 41 ++++++++++++++++++++++++-----
 mm/migrate.c                |  6 -----
 mm/mprotect.c               |  4 +--
 15 files changed, 135 insertions(+), 67 deletions(-)

-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]
  Powered by Linux