Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64

I did some experiments to understand the impact of making 5-level page tables
the default.
Machine info: AMD Zen 4 EPYC server (2-socket system, 128 cores and 1 NUMA
node per socket, SMT enabled)
Each NUMA node has approx. 377 GB of memory.

For the experiments, I bound the benchmark to the CPUs and memory node of a
single socket for consistent results. Results were measured with 5-level page
tables enabled/disabled via CONFIG_X86_5LEVEL.
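For reference, the binding and a runtime check can be sketched roughly as
below; the benchmark name and node number are placeholders, not the exact
commands used for these runs:

```shell
# Bind a benchmark to the CPUs and memory of one socket/NUMA node
# (placeholder invocation; actual benchmark binaries are not shown here):
#   numactl --cpunodebind=0 --membind=0 ./benchmark
# Check whether the CPU exposes 5-level paging (the la57 feature flag):
la57=$(grep -qw la57 /proc/cpuinfo 2>/dev/null && echo yes || echo no)
echo "la57: $la57"
```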

% Change: (5L-4L)/4L*100
CoV (%):  Coefficient of Variation (%)
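As a quick sketch, the two statistics above can be computed like this (the
inputs are illustrative numbers taken from the tables below, not raw run
data):

```shell
# % Change: (5L - 4L) / 4L * 100, using the lmbench page-fault means
pct=$(awk 'BEGIN { printf "%.2f", (0.4294 - 0.4068) / 0.4068 * 100 }')
echo "% Change = $pct"

# CoV (%): sample standard deviation / mean * 100, over a set of times
cov=$(echo "382.2 383.0 392.8" | awk '{
    for (i = 1; i <= NF; i++) s += $i; m = s / NF
    for (i = 1; i <= NF; i++) ss += ($i - m) ^ 2
    printf "%.3f", sqrt(ss / (NF - 1)) / m * 100
}')
echo "CoV = $cov"
```

The three times fed to the CoV computation are the 4-level Btree rows, used
only to illustrate the formula; the report's CoV values come from repeated
runs of each configuration.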

Results:

lmbench:lat_pagefault: Metric- page-fault time (us) - Lower is better
                4-Level PT              5-Level PT              % Change
THP-Never       Mean: 0.4068            Mean: 0.4294            5.56
                95% CI: 0.4057-0.4078   95% CI: 0.4287-0.4302

THP-Always      Mean: 0.4061            Mean: 0.4288            5.59
                95% CI: 0.4051-0.4071   95% CI: 0.4281-0.4295


Btree (Thread:32): Metric- Time Taken (in seconds) - Lower is better
                4-Level                 5-Level               
                Time Taken(s) CoV (%)   Time Taken(s) CoV(%)    % Change
THP Never       382.2         0.219     388.8         1.019     1.73
THP Madvise     383.0         0.261     384.8         0.809     0.47
THP Always      392.8         1.376     386.4         2.147     -1.63

Btree (Thread:256): Metric- Time Taken (in seconds) - Lower is better
                4-Level                 5-Level               
                Time Taken(s) CoV (%)   Time Taken(s) CoV(%)     % Change
THP Never       56.6          2.014     55.2          0.810     -2.47
THP Madvise     56.6          2.014     56.4          2.022     -0.35
THP Always      56.6          0.968     56.2          1.489     -0.71


Ebizzy: Metric- records/s - Higher is better
                4-Level                 5-Level
Threads         record/s    CoV (%)     record/s    CoV(%)      % Change
1               844         0.302       837         0.196       -0.85
256             10160       0.315       10288       1.081       1.26


XSBench (Thread:256, THP:Never) - Higher is better
Metric          4-Level         5-Level         % Change
Lookups/s       13720556        13396288        -2.36
CoV (%)         1.726           1.317


Hashjoin (Thread:256, THP:Never) - Lower is better
Metric          4-Level         5-Level         % Change
Time taken(s)   424.4           427.4           0.707
CoV (%)         0.394           0.209


Graph500(Thread:256, THP:Madvise) - Lower is better
Metric          4-Level         5-Level       % Change
Time Taken(s)   0.1879          0.1873        -0.32
CoV (%)         0.165           0.213


GUPS(Thread:128, THP:Madvise) - Higher is better
Metric          4-Level         5-Level       % Change
GUPS            1.3265          1.3252        -0.10
CoV (%)         0.037           0.027


pagerank(Thread:256, THP:Madvise) - Lower is better
Metric          4-Level         5-Level       % Change
Time taken(s)   143.67          143.67        0.00
CoV (%)         0.402           0.402


Redis(Thread:256, THP:Madvise) - Higher is better
Metric              4-Level     5-Level       % Change
Throughput(Ops/s)   141030744   139586376     -1.02
CoV (%)             0.372       0.561


memcached(Thread:256, THP:Madvise) - Higher is better
Metric              4-Level     5-Level       % Change
Throughput(Ops/s)   19916313    19743637      -0.87
CoV (%)             0.051       0.095


Inference:
5-level page tables show an increase in page-fault latency (~5.6% in
lmbench), but do not significantly impact the other benchmarks.


Thanks,
Shivank



