Re: [PATCH v2] mm: Optimized hugepage zeroing & copying from user

Prathu Baronia <prathu.baronia@xxxxxxxxxxx> · Tue, 21 Apr 2020 15:06:21 +0530

With the v2 patch below, we observe a significant (~65%) improvement in zeroing
time for hugepages.

We profiled clear_huge_page() using ftrace on Qualcomm's SM8150 platform
under controlled conditions (i.e. only CPU0 and CPU6 turned on and set to max
frequency, and DDR set to the performance governor).

The existing method uses a reverse traversal of a section of a hugepage, which,
based on our series of experiments, proves slower than a oneshot (v2) approach
on ARM64 (more details in the mail thread).
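
To illustrate the two traversal orders being compared, here is a minimal
userspace sketch. It is not the kernel code and not the v2 patch itself: the
function names, the 4 KiB subpage size and the 512-subpage (2 MiB) hugepage are
assumptions made purely for illustration, and memset() stands in for the
per-subpage clearing routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SUBPAGE_SIZE 4096u   /* assumed base page size */
#define NR_SUBPAGES  512u    /* assumed: 2 MiB hugepage / 4 KiB subpages */

/* One-shot order: clear subpages in ascending address order, front to back. */
void zero_oneshot(unsigned char *hpage)
{
    assert(hpage != NULL);
    for (size_t i = 0; i < NR_SUBPAGES; i++)
        memset(hpage + i * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
}

/* Reverse order over the section [first, last]: clear from last down to first. */
void zero_section_reverse(unsigned char *hpage, size_t first, size_t last)
{
    assert(hpage != NULL && first <= last && last < NR_SUBPAGES);
    for (size_t i = last + 1; i-- > first; )
        memset(hpage + i * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
}

/* Returns 1 if every byte of the hugepage-sized buffer is zero, else 0. */
int is_all_zero(const unsigned char *hpage)
{
    for (size_t i = 0; i < (size_t)NR_SUBPAGES * SUBPAGE_SIZE; i++)
        if (hpage[i] != 0)
            return 0;
    return 1;
}
```

Both functions leave the buffer in the same state; only the order in which
memory is touched differs, which is the variable our measurements compare.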

We didn't see any benefit on x86, so v2 probably won't find a place in the main
memory.c code.

We are currently thinking of making this optimization ARM64-specific for better
performance by placing it in a new arch/arm64/mm/memory.c file (to be created).
We would really appreciate it if you could share your opinion on this.

-- 
Prathu Baronia
OnePlus RnD