From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>

Clearing a 2MB huge page will typically blow away several levels of CPU
caches. To avoid this, only clear the 4K area around the fault address
through the cache and use cache-avoiding clears for the rest of the 2MB
area.

This patchset implements a cache-avoiding version of clear_page for x86
only. If an architecture wants to provide a cache-avoiding version of
clear_page, it should define ARCH_HAS_USER_NOCACHE to 1 and implement
clear_page_nocache() and clear_user_highpage_nocache().

v2:
 - No code change. Only commit messages are updated.
 - RFC mark is dropped.

Andi Kleen (6):
  THP: Use real address for NUMA policy
  mm: make clear_huge_page tolerate non aligned address
  THP: Pass real, not rounded, address to clear_huge_page
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2

 arch/x86/include/asm/page.h          |    2 +
 arch/x86/include/asm/string_32.h     |    5 ++
 arch/x86/include/asm/string_64.h     |    5 ++
 arch/x86/lib/Makefile                |    1 +
 arch/x86/lib/clear_page_nocache_32.S |   30 +++++++++++
 arch/x86/lib/clear_page_nocache_64.S |   92 ++++++++++++++++++++++++++++++++++
 arch/x86/mm/fault.c                  |    7 +++
 mm/huge_memory.c                     |   17 +++---
 mm/memory.c                          |   29 ++++++++++-
 9 files changed, 178 insertions(+), 10 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_nocache_32.S
 create mode 100644 arch/x86/lib/clear_page_nocache_64.S

--
1.7.7.6