Re: [PATCH v2] mm: Optimized hugepage zeroing & copying from user

The 04/15/2020 11:27, Huang, Ying wrote:
> 
> Can you describe your test?
> 
We profile clear_huge_page() with ftrace while forcing it to trigger in parallel
from a simple userspace test program, which allocates 100MB of anonymous memory
and traverses it in a loop.
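
For reference, a minimal sketch of what such a test can look like. This is not
the exact program used for the numbers below: the helper name
touch_anon_region() and the MADV_HUGEPAGE hint are assumptions made for
illustration, and whether a fault is actually served by a transparent hugepage
depends on the system's THP settings.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): map `bytes` of anonymous
 * memory, write one byte to every base page `passes` times, and return
 * the number of writes performed (0 if mmap fails).
 *
 * The first write to each not-yet-populated range takes a page fault;
 * when that fault is served by a transparent hugepage, the kernel has to
 * zero the hugepage, which is the path being profiled.
 */
size_t touch_anon_region(size_t bytes, int passes)
{
	long psz = sysconf(_SC_PAGESIZE);
	assert(psz > 0);
	size_t page = (size_t)psz;

	unsigned char *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return 0;

#ifdef MADV_HUGEPAGE
	/* Assumption: ask for THP explicitly; failure is not fatal here. */
	(void)madvise(buf, bytes, MADV_HUGEPAGE);
#endif

	/* volatile so the compiler cannot drop the stores */
	volatile unsigned char *p = buf;
	size_t touched = 0;

	for (int i = 0; i < passes; i++) {
		for (size_t off = 0; off < bytes; off += page) {
			p[off] = (unsigned char)i;
			touched++;
		}
	}

	munmap(buf, bytes);
	return touched;
}
```

Calling touch_anon_region(100UL << 20, 1) repeatedly maps, faults in and unmaps
100MB each time, so every call goes through the fault path again.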
> 
> You have tested the chunk sizes 4KB and 2MB, can you test some values in
> between?  For example 32KB or 64KB?  Maybe there's a sweet point with
> some smaller granularity and good performance.
Based on your advice I tried chunk sizes of 4KB, 8KB, 16KB, 32KB and 64KB on
arm64 and x86_64, copying the kernel memset implementation for each architecture.
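
To make the three cases in the results concrete, here is a rough userspace
sketch of the access patterns. It assumes that Oneshot means a single memset
over the whole region, Forward means chunk-by-chunk from the lowest address and
Reverse means chunk-by-chunk from the highest address. clear_region() and the
enum are hypothetical names, and libc memset stands in for the copied kernel
memset, so this is not the measured code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum clear_order { CLEAR_ONESHOT, CLEAR_FORWARD, CLEAR_REVERSE };

/*
 * Hypothetical helper: zero `len` bytes at `buf` in pieces of `chunk`
 * bytes, in the given order. Returns the number of memset calls made.
 */
size_t clear_region(unsigned char *buf, size_t len, size_t chunk,
		    enum clear_order order)
{
	size_t calls = 0;

	assert(buf != NULL);

	/* One call covering the whole region. */
	if (order == CLEAR_ONESHOT || chunk == 0 || chunk >= len) {
		memset(buf, 0, len);
		return 1;
	}

	if (order == CLEAR_FORWARD) {
		/* Ascending addresses: chunk 0, 1, 2, ... */
		for (size_t off = 0; off < len; off += chunk) {
			size_t n = len - off < chunk ? len - off : chunk;

			memset(buf + off, 0, n);
			calls++;
		}
	} else {
		/* Descending addresses: last chunk first. */
		size_t off = len;

		while (off > 0) {
			/* A partial tail chunk, if any, is cleared first. */
			size_t n = off % chunk ? off % chunk : chunk;

			off -= n;
			memset(buf + off, 0, n);
			calls++;
		}
	}
	return calls;
}
```

For a 2MB region, clear_region(buf, 2UL << 20, 4096, CLEAR_REVERSE) issues 512
memset calls starting from the top of the region, while CLEAR_ONESHOT issues one.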
-------------------------------------------------------------------------------
Results (sample size is 100 for each case; values are in us):
-------------------------------------------------------------------------------
ARM64 (CPU0 & 6 on and set to max frequency, DDR set to performance governor):
-------------------------------------------------------------------------------
Chunk Size = 4KB
-----------------
Oneshot
	Mean : 3402.06
	Stddev : 72.6576
Forward
	Mean : 3408.04
	Stddev : 72.976
Reverse
	Mean : 17699.3
	Stddev : 132.875
-----------------
Chunk Size = 8KB
-----------------
Oneshot
	Mean : 3398.64
	Stddev : 80.6334
Forward
	Mean : 3391.58
	Stddev : 65.9063
Reverse
	Mean : 13909.2
	Stddev : 194.324
-----------------
Chunk Size = 16KB
-----------------
Oneshot
	Mean : 3393.57
	Stddev : 72.2485
Forward
	Mean : 3404.69
	Stddev : 84.4705
Reverse
	Mean : 9278.65
	Stddev : 217.725
-----------------
Chunk Size = 32KB
-----------------
Oneshot
	Mean : 3425.7
	Stddev : 129.156
Forward
	Mean : 3402.07
	Stddev : 82.6713
Reverse
	Mean : 6831.43
	Stddev : 184.807
-----------------
Chunk Size = 64KB
-----------------
Oneshot
	Mean : 3398.72
	Stddev : 77.9703
Forward
	Mean : 3413.52
	Stddev : 173.121
Reverse
	Mean : 5542.84
	Stddev : 197.017
---------------------------------------------
x86_64 (only CPU0 on and set to max frequency)
---------------------------------------------
Chunk Size = 4KB
-----------------
Oneshot
	Mean : 6752.59
	Stddev : 298.988
Forward
	Mean : 6873.6
	Stddev : 325.607
Reverse
	Mean : 6722.88
	Stddev : 365.837
-----------------
Chunk Size = 8KB
-----------------
Oneshot
	Mean : 6848.57
	Stddev : 955.312
Forward
	Mean : 7012.24
	Stddev : 1377.27
Reverse
	Mean : 6688.83
	Stddev : 589.935
-----------------
Chunk Size = 16KB
-----------------
Oneshot
	Mean : 6846.87
	Stddev : 546.173
Forward
	Mean : 6785.26
	Stddev : 248.022
Reverse
	Mean : 6613.33
	Stddev : 350.003
-----------------
Chunk Size = 32KB
-----------------
Oneshot
	Mean : 6862.19
	Stddev : 870.524
Forward
	Mean : 6826.3
	Stddev : 870.023
Reverse
	Mean : 6747.69
	Stddev : 1047.5
-----------------
Chunk Size = 64KB
-----------------
Oneshot
	Mean : 6806.9
	Stddev : 609.112
Forward
	Mean : 6774.53
	Stddev : 311.954
Reverse
	Mean : 6553.47
	Stddev : 293.52
-- 
Prathu Baronia
OnePlus RnD



