On 08/30/2016 09:09 AM, Andrew Morton wrote:
> On Tue, 30 Aug 2016 11:09:15 +0800 Aaron Lu <aaron.lu@xxxxxxxxx> wrote:
>
>>>> Case used for test on Haswell EP:
>>>> usemem -n 72 --readonly -j 0x200000 100G
>>>> Which spawns 72 processes and each will mmap 100G anonymous space and
>>>> then do read-only access to that space sequentially with a step of 2MB.
>>>>
>>>> perf report for base commit:
>>>>     54.03%  usemem  [kernel.kallsyms]  [k] get_huge_zero_page
>>>> perf report for this commit:
>>>>      0.11%  usemem  [kernel.kallsyms]  [k] mm_get_huge_zero_page
>>>
>>> Does this mean that overall usemem runtime halved?
>>
>> Sorry for the confusion, the above lines are extracted from perf report.
>> They show the percentage of CPU cycles spent in a specific function.
>>
>> The two perf lines are meant to show that get_huge_zero_page no longer
>> consumes a significant share of CPU cycles after applying the patch.
>>
>>>
>>> Do we have any numbers for something which is more real-worldly?
>>
>> Unfortunately, no real world numbers.
>>
>> We think the global atomic counter could be an issue for performance,
>> so I'm trying to solve that problem.
>
> So, umm, we don't actually know if the patch is useful to anyone?

On a POWER system it improves the CPU consumption of the above mentioned
function a little bit. I don't think it is going to improve the actual
throughput of the workload substantially.

    0.07%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page

to

    0.01%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page
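
For reference, below is a minimal userspace sketch (not the actual kernel
code) of the general technique being discussed: instead of bumping one
global atomic refcount on every huge-zero-page fault, remember per mm that
the reference has already been taken, so the shared cache line is written
at most once per process. The names here (fake_mm, huge_zero_ref_taken,
fault_old/fault_new) are invented for illustration only.

    /*
     * Illustrative sketch, compiled as ordinary C11 userspace code.
     * Assumption: the contended object is a single global atomic counter
     * that every page fault used to increment.
     */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    static atomic_long huge_zero_refcount;     /* the contended global counter */

    struct fake_mm {
            atomic_bool huge_zero_ref_taken;   /* per-process state, private cache line */
    };

    static struct fake_mm the_mm;              /* zero-initialized: reference not yet taken */

    /* Old scheme: every fault writes the shared counter's cache line. */
    static void fault_old(struct fake_mm *mm)
    {
            (void)mm;
            atomic_fetch_add(&huge_zero_refcount, 1);
    }

    /* New scheme: only the first fault per mm touches the shared counter. */
    static void fault_new(struct fake_mm *mm)
    {
            bool expected = false;

            if (atomic_load_explicit(&mm->huge_zero_ref_taken, memory_order_relaxed))
                    return;  /* common case: no shared cache line written */

            if (atomic_compare_exchange_strong(&mm->huge_zero_ref_taken,
                                               &expected, true))
                    atomic_fetch_add(&huge_zero_refcount, 1);
    }

    int main(void)
    {
            int i;

            for (i = 0; i < 1000; i++)
                    fault_new(&the_mm);  /* shared counter bumped exactly once */

            printf("refcount after 1000 faults: %ld\n",
                   atomic_load(&huge_zero_refcount));
            return 0;
    }

With many processes faulting in parallel, fault_old() makes them all
serialize on the same cache line, which is what the 54.03% perf sample
above reflects; fault_new() keeps the hot path read-only on per-process
state.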