THP allocation for anon memory requires zeroing of the huge page. To do
so, we currently iterate over the 2MB region in 4KB chunks; each
iteration calls kmap_atomic() and kunmap_atomic(). This routine makes
sense where we need a temporary kernel mapping of the user page. In the
!HIGHMEM case, especially on 64-bit architectures, no temporary mapping
is needed: the page is already reachable through the kernel's linear
mapping, so kmap_atomic() acts as nothing more than multiple barrier()
calls.

This calls for optimization. Simply getting the kernel virtual address
from the page via page_address() does the job for us. So, implement
another (optimized) routine for clear_huge_page() which doesn't need a
temporary mapping of the userspace page.

While testing this patch on a Qualcomm SM8150 SoC (kernel v4.14.117),
we see a 64% improvement in clear_huge_page().

Ftrace results:

Default profile:
------------------------------------------
 6)   ! 473.802 us  |  clear_huge_page();
------------------------------------------

With this patch applied:
------------------------------------------
 5)   ! 170.156 us  |  clear_huge_page();
------------------------------------------

Signed-off-by: Prathu Baronia <prathu.baronia@xxxxxxxxxxx>
Reported-by: Chintan Pandya <chintan.pandya@xxxxxxxxxxx>
---
 mm/memory.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 3ee073d..3e120e8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5119,6 +5119,7 @@ EXPORT_SYMBOL(__might_fault);
 #endif

 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+#ifdef CONFIG_HIGHMEM
 static void clear_gigantic_page(struct page *page,
				unsigned long addr,
				unsigned int pages_per_huge_page)
@@ -5183,6 +5184,16 @@ void clear_huge_page(struct page *page,
				addr + right_idx * PAGE_SIZE);
	}
 }
+#else
+void clear_huge_page(struct page *page,
+		     unsigned long addr_hint, unsigned int pages_per_huge_page)
+{
+	void *addr;
+
+	addr = page_address(page);
+	memset(addr, 0, pages_per_huge_page * PAGE_SIZE);
+}
+#endif

 static void copy_user_gigantic_page(struct page *dst, struct page *src,
				    unsigned long addr,
--
2.7.4

--
Prathu Baronia
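
P.S. For anyone who wants to experiment with the idea outside the
kernel tree, here is a minimal userspace sketch. The kmap stubs, names,
and sizes below are illustrative only (they are not the real kernel
helpers); it just contrasts the old per-4KB mapping loop with the
single memset() that the !HIGHMEM path can use:

/* Chunked vs. flat clearing of a 2MB region, userspace analogy. */
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE      4096UL
#define HPAGE_NR_PAGES 512UL   /* 2MB huge page = 512 x 4KB */

/* On !HIGHMEM, kmap_atomic() boils down to returning the address. */
static void *kmap_atomic_stub(void *subpage) { return subpage; }
static void kunmap_atomic_stub(void *addr)   { (void)addr; }

/* Old path: map and clear one 4KB subpage per iteration. */
static void clear_huge_page_chunked(void *page)
{
	unsigned long i;

	for (i = 0; i < HPAGE_NR_PAGES; i++) {
		void *kaddr = kmap_atomic_stub((char *)page + i * PAGE_SIZE);

		memset(kaddr, 0, PAGE_SIZE);
		kunmap_atomic_stub(kaddr);
	}
}

/* New !HIGHMEM path: the whole huge page is already mapped. */
static void clear_huge_page_flat(void *page)
{
	memset(page, 0, HPAGE_NR_PAGES * PAGE_SIZE);
}

int main(void)
{
	void *buf = aligned_alloc(PAGE_SIZE, HPAGE_NR_PAGES * PAGE_SIZE);

	if (!buf)
		return 1;
	clear_huge_page_chunked(buf);
	clear_huge_page_flat(buf);
	free(buf);
	return 0;
}

The point of the sketch is only that dropping 512 map/unmap round trips
lets memset() stream through the whole 2MB in one call, which is what
the page_address() version in the patch does.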