On Tue, Apr 16, 2024 at 8:13 AM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:
>
> * jeffxu@xxxxxxxxxxxx <jeffxu@xxxxxxxxxxxx> [240415 12:35]:
> > From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> >
> > This is V10 version, it rebases v9 patch to 6.9.rc3.
> > We also applied and tested mseal() in chrome and chromebook.
> >
> > ------------------------------------------------------------------
> ...
>
> > MM perf benchmarks
> > ==================
> > This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to
> > check the VMAs’ sealing flag, so that no partial update can be made,
> > when any segment within the given memory range is sealed.
> >
> > To measure the performance impact of this loop, two tests are developed.
> > [8]
> >
> > The first is measuring the time taken for a particular system call,
> > by using clock_gettime(CLOCK_MONOTONIC). The second is using
> > PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have
> > similar results.
> >
> > The tests have roughly below sequence:
> > for (i = 0; i < 1000; i++)
> >     create 1000 mappings (1 page per VMA)
> >     start the sampling
> >     for (j = 0; j < 1000; j++)
> >         mprotect one mapping
> >     stop and save the sample
> >     delete 1000 mappings
> > calculates all samples.
> >
> Thank you for doing this performance testing.
>
> >
> > Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz,
> > 4G memory, Chromebook.
> >
> > Based on the latest upstream code:
> > The first test (measuring time)
> > syscall__  vmas       t  t_mseal  delta_ns  per_vma     %
> > munmap__      1     909      944        35       35  104%
> > munmap__      2    1398     1502       104       52  107%
> > munmap__      4    2444     2594       149       37  106%
> > munmap__      8    4029     4323       293       37  107%
> > munmap__     16    6647     6935       288       18  104%
> > munmap__     32   11811    12398       587       18  105%
> > mprotect      1     439      465        26       26  106%
> > mprotect      2    1659     1745        86       43  105%
> > mprotect      4    3747     3889       142       36  104%
> > mprotect      8    6755     6969       215       27  103%
> > mprotect     16   13748    14144       396       25  103%
> > mprotect     32   27827    28969      1142       36  104%
> > madvise_      1     240      262        22       22  109%
> > madvise_      2     366      442        76       38  121%
> > madvise_      4     623      751       128       32  121%
> > madvise_      8    1110     1324       215       27  119%
> > madvise_     16    2127     2451       324       20  115%
> > madvise_     32    4109     4642       534       17  113%
> >
> > The second test (measuring cpu cycle)
> > syscall__  vmas     cpu   cmseal  delta_cpu  per_vma     %
> > munmap__      1    1790     1890        100      100  106%
> > munmap__      2    2819     3033        214      107  108%
> > munmap__      4    4959     5271        312       78  106%
> > munmap__      8    8262     8745        483       60  106%
> > munmap__     16   13099    14116       1017       64  108%
> > munmap__     32   23221    24785       1565       49  107%
> > mprotect      1     906      967         62       62  107%
> > mprotect      2    3019     3203        184       92  106%
> > mprotect      4    6149     6569        420      105  107%
> > mprotect      8    9978    10524        545       68  105%
> > mprotect     16   20448    21427        979       61  105%
> > mprotect     32   40972    42935       1963       61  105%
> > madvise_      1     434      497         63       63  115%
> > madvise_      2     752      899        147       74  120%
> > madvise_      4    1313     1513        200       50  115%
> > madvise_      8    2271     2627        356       44  116%
> > madvise_     16    4312     4883        571       36  113%
> > madvise_     32    8376     9319        943       29  111%
> >
>
> If I am reading this right, madvise() is affected more than the other
> calls?  Is that expected or do we need to have a closer look?
>

The madvise() has a bigger percentage (per_vma %), but it also has a
smaller base value (cpu).

-Jeff