On 04/15/2020 11:27, Huang, Ying wrote:
>
> Can you describe your test?
>

We profile clear_huge_page() using ftrace while, in parallel, forcing it to
trigger with a simple userspace test program that allocates 100MB of anon
memory and traverses it in a loop.

>
> You have tested the chunk sizes 4KB and 2MB, can you test some values in
> between?  For example 32KB or 64KB?  Maybe there's a sweet point with
> some smaller granularity and good performance.

Based on your advice, I tried chunk sizes of 4KB, 8KB, 16KB, 32KB and 64KB on
arm64 and x86_64, copying the kernel memset implementation for each arch.

-------------------------------------------------------------------------------
Results (the sample size is 100 for each and the values are in us):-
-------------------------------------------------------------------------------
ARM64 (CPU0 & 6 on and set at max frequency, DDR set to performance governor):-
-------------------------------------------------------------------------------
Chunk Size = 4KB
-----------------
Oneshot   Mean : 3402.06   Stddev : 72.6576
Forward   Mean : 3408.04   Stddev : 72.976
Reverse   Mean : 17699.3   Stddev : 132.875
-----------------
Chunk Size = 8KB
-----------------
Oneshot   Mean : 3398.64   Stddev : 80.6334
Forward   Mean : 3391.58   Stddev : 65.9063
Reverse   Mean : 13909.2   Stddev : 194.324
-----------------
Chunk Size = 16KB
-----------------
Oneshot   Mean : 3393.57   Stddev : 72.2485
Forward   Mean : 3404.69   Stddev : 84.4705
Reverse   Mean : 9278.65   Stddev : 217.725
-----------------
Chunk Size = 32KB
-----------------
Oneshot   Mean : 3425.7    Stddev : 129.156
Forward   Mean : 3402.07   Stddev : 82.6713
Reverse   Mean : 6831.43   Stddev : 184.807
-----------------
Chunk Size = 64KB
-----------------
Oneshot   Mean : 3398.72   Stddev : 77.9703
Forward   Mean : 3413.52   Stddev : 173.121
Reverse   Mean : 5542.84   Stddev : 197.017
---------------------------------------------
x86_64 (Only CPU0 on and set to max frequency)
---------------------------------------------
Chunk Size = 4KB
-----------------
Oneshot   Mean : 6752.59   Stddev : 298.988
Forward   Mean : 6873.6    Stddev : 325.607
Reverse   Mean : 6722.88   Stddev : 365.837
-----------------
Chunk Size = 8KB
-----------------
Oneshot   Mean : 6848.57   Stddev : 955.312
Forward   Mean : 7012.24   Stddev : 1377.27
Reverse   Mean : 6688.83   Stddev : 589.935
-----------------
Chunk Size = 16KB
-----------------
Oneshot   Mean : 6846.87   Stddev : 546.173
Forward   Mean : 6785.26   Stddev : 248.022
Reverse   Mean : 6613.33   Stddev : 350.003
-----------------
Chunk Size = 32KB
-----------------
Oneshot   Mean : 6862.19   Stddev : 870.524
Forward   Mean : 6826.3    Stddev : 870.023
Reverse   Mean : 6747.69   Stddev : 1047.5
-----------------
Chunk Size = 64KB
-----------------
Oneshot   Mean : 6806.9    Stddev : 609.112
Forward   Mean : 6774.53   Stddev : 311.954
Reverse   Mean : 6553.47   Stddev : 293.52

-- 
Prathu Baronia
OnePlus RnD
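
P.S. For anyone who wants to reproduce something similar, below is a minimal
sketch of the kind of userspace test described above: map anonymous memory and
write to every page so the kernel has to fault in and clear the pages. It is
not the exact program used for the numbers in this mail. The helper names
touch_anon_region() and run_loops() are made up for illustration, and the
madvise(MADV_HUGEPAGE) hint is an assumption; whether the faults actually go
through clear_huge_page() depends on the THP settings of the system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `bytes` of anonymous memory, write one byte per
 * base page so that every page gets faulted in, then unmap the region.
 * Returns the number of base pages touched, or -1 if the mapping fails.
 */
static long touch_anon_region(size_t bytes)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	long touched = 0;
	size_t off;

	assert(page > 0);

	buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

#ifdef MADV_HUGEPAGE
	/* Best-effort hint (assumption); the kernel is free to ignore it. */
	(void)madvise(buf, bytes, MADV_HUGEPAGE);
#endif

	/* The first write to each page triggers the page fault. */
	for (off = 0; off < bytes; off += (size_t)page) {
		buf[off] = 1;
		touched++;
	}

	munmap(buf, bytes);
	return touched;
}

/*
 * Hypothetical helper: repeat the map/touch/unmap cycle `loops` times so
 * that a tracer has several samples to collect. Returns the total number
 * of pages touched, or -1 on the first failure.
 */
static long run_loops(size_t bytes, int loops)
{
	long total = 0;
	int i;

	for (i = 0; i < loops; i++) {
		long n = touch_anon_region(bytes);

		if (n < 0)
			return -1;
		total += n;
	}
	return total;
}
```

Calling run_loops((size_t)100 << 20, 100) from main() while the ftrace
function profiler is enabled would give 100 passes over a 100MB region. The
region is unmapped and remapped on each pass so that every pass faults its
pages in again instead of reusing ones that are already populated.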