From: Kalesh Singh <kaleshsingh@xxxxxxxxxx> Subject: arm64: mremap speedup - enable HAVE_MOVE_PUD HAVE_MOVE_PUD enables remapping pages at the PUD level if both the source and destination addresses are PUD-aligned. With HAVE_MOVE_PUD enabled it can be inferred that there is approximately a 19x improvement in performance on arm64. (See data below). ------- Test Results --------- The following results were obtained using a 5.4 kernel, by remapping a PUD-aligned, 1GB sized region to a PUD-aligned destination. The results from 10 iterations of the test are given below: Total mremap times for 1GB data on arm64. All times are in nanoseconds. Control HAVE_MOVE_PUD 1247761 74271 1219896 46771 1094792 59687 1227760 48385 1043698 76666 1101771 50365 1159896 52500 1143594 75261 1025833 61354 1078125 48697 1134312.6 59395.7 <-- Mean time in nanoseconds A 1GB mremap completion time drops from ~1.1 milliseconds to ~59 microseconds on arm64. (~19x speed up). Link: https://lkml.kernel.org/r/20201014005320.2233162-5-kaleshsingh@xxxxxxxxxx Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx> Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Cc: Catalin Marinas <catalin.marinas@xxxxxxx> Cc: Will Deacon <will@xxxxxxxxxx> Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx> Cc: Arnd Bergmann <arnd@xxxxxxxx> Cc: Borislav Petkov <bp@xxxxxxxxx> Cc: Brian Geffon <bgeffon@xxxxxxxxxx> Cc: Christian Brauner <christian.brauner@xxxxxxxxxx> Cc: Dave Hansen <dave.hansen@xxxxxxxxx> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx> Cc: Gavin Shan <gshan@xxxxxxxxxx> Cc: Hassan Naveed <hnaveed@xxxxxxxxxxxx> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx> Cc: Ingo Molnar <mingo@xxxxxxxxxx> Cc: Jia He <justin.he@xxxxxxx> Cc: John Hubbard <jhubbard@xxxxxxxxxx> Cc: Kees Cook <keescook@xxxxxxxxxxxx> Cc: Krzysztof Kozlowski <krzk@xxxxxxxxxx> Cc: Lokesh Gidra <lokeshgidra@xxxxxxxxxx> Cc: Mark Rutland <mark.rutland@xxxxxxx> Cc: Masahiro Yamada <masahiroy@xxxxxxxxxx> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx> Cc: Mike Rapoport <rppt@xxxxxxxxxx> Cc: Mina Almasry <almasrymina@xxxxxxxxxx> Cc: Minchan Kim <minchan@xxxxxxxxxx> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Cc: Ralph Campbell <rcampbell@xxxxxxxxxx> Cc: Ram Pai <linuxram@xxxxxxxxxx> Cc: Sami Tolvanen <samitolvanen@xxxxxxxxxx> Cc: Sandipan Das <sandipan@xxxxxxxxxxxxx> Cc: SeongJae Park <sjpark@xxxxxxxxx> Cc: Shuah Khan <shuah@xxxxxxxxxx> Cc: Steven Price <steven.price@xxxxxxx> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Zi Yan <ziy@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/pgtable.h | 1 + 2 files changed, 2 insertions(+) --- a/arch/arm64/include/asm/pgtable.h~arm64-mremap-speedup-enable-have_move_pud +++ a/arch/arm64/include/asm/pgtable.h @@ -462,6 +462,7 @@ static inline pmd_t pmd_mkdevmap(pmd_t p #define pfn_pud(pfn,prot) __pud(__phys_to_pud_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)) #define set_pmd_at(mm, addr, pmdp, pmd) set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd)) +#define set_pud_at(mm, addr, pudp, pud) set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud)) #define __p4d_to_phys(p4d) __pte_to_phys(p4d_pte(p4d)) #define __phys_to_p4d_val(phys) __phys_to_pte_val(phys) --- a/arch/arm64/Kconfig~arm64-mremap-speedup-enable-have_move_pud +++ a/arch/arm64/Kconfig @@ -125,6 +125,7 @@ config ARM64 select HANDLE_DOMAIN_IRQ select HARDIRQS_SW_RESEND select HAVE_MOVE_PMD + select HAVE_MOVE_PUD select HAVE_PCI select HAVE_ACPI_APEI if (ACPI && EFI) select HAVE_ALIGNED_STRUCT_PAGE if SLUB _