Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64

"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> · Wed, 31 Jul 2024 14:36:47 +0300

On Wed, Jul 31, 2024 at 11:15:05AM +0200, Thomas Gleixner wrote:
> On Wed, Jul 31 2024 at 14:27, Shivank Garg wrote:
> > lmbench:lat_pagefault: Metric: page-fault time (us), lower is better
> >                 4-Level PT               5-Level PT               % Change
> > THP-never       Mean:   0.4068           Mean:   0.4294           5.56
> >                 95% CI: 0.4057-0.4078    95% CI: 0.4287-0.4302
> >
> > THP-always      Mean:   0.4061           Mean:   0.4288           5.59
> >                 95% CI: 0.4051-0.4071    95% CI: 0.4281-0.4295
> >
> > Inference:
> > 5-level page table shows increase in page-fault latency but it does
> > not significantly impact other benchmarks.
> 
> 5% regression on lmbench is a NONO.

Yeah, that's a biggy.
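
To double-check the numbers in the table, here is a minimal sketch of
the arithmetic. The helper names percent_change() and ci_overlap() are
made up for illustration, and the overlap test is only a quick sanity
check, not a proper statistical test:

```c
#include <assert.h>
#include <stdbool.h>

/* Relative change of 'after' versus 'before', in percent. */
double percent_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}

/* True if the closed intervals [lo_a, hi_a] and [lo_b, hi_b] overlap. */
bool ci_overlap(double lo_a, double hi_a, double lo_b, double hi_b)
{
	return lo_a <= hi_b && lo_b <= hi_a;
}
```

With the reported means, percent_change(0.4068, 0.4294) and
percent_change(0.4061, 0.4288) come out at about 5.56 and 5.59, and
the 4-level and 5-level confidence intervals do not overlap in either
THP mode, so the difference does not look like noise.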

In our testing (on Intel HW) we didn't see any significant difference
between 4- and 5-level paging, but we were focused on TLB fill latency,
both on bare metal and in VMs. Maybe something is wrong in the fault
path?

It requires a closer look.

Shivank, could you share how you run lat_pagefault? What file size? How
parallel do you run it?

It would also be nice to get perf traces. Maybe it is purely a SW issue.
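
For a quick cross-check outside of lmbench, something like the sketch
below could be used. It is not the lat_pagefault code: it assumes
anonymous private memory instead of a file mapping, the function name
mean_fault_us() is made up, and with THP enabled a single touch may
fault in a whole huge page, so the per-page number is only indicative:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Map 'npages' of anonymous memory, touch each page once and return the
 * mean time per first touch in microseconds, or -1.0 on failure.
 */
double mean_fault_us(size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;
	volatile char *p;
	void *map;
	size_t len, i;

	if (npages == 0 || psz <= 0)
		return -1.0;

	len = npages * (size_t)psz;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1.0;
	p = map;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < npages; i++)
		p[i * (size_t)psz] = 1;	/* first write triggers the fault */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(map, len);

	return ((double)(t1.tv_sec - t0.tv_sec) * 1e6 +
		(double)(t1.tv_nsec - t0.tv_nsec) / 1e3) / (double)npages;
}
```

Running it under perf on 4-level and 5-level boots of the same kernel
should show whether the extra time is spent in the fault handler or
elsewhere.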

> 5-level page tables add a cost in every hardware page table walk. That's
> a matter of fact and there is absolutely no reason to inflict this cost
> on everyone.
>
> The solution to this is to make the 5-level mechanics smarter by evaluating
> whether the machine has enough memory to require 5-level tables and
> select the depth at boot time.

Let's understand the reason first.

The risk with your proposal is that 5-level paging will not get any
testing and will rot over time.
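
To make the discussion concrete, the boot-time selection could look
roughly like the sketch below. This is not kernel code:
choose_paging_levels() and MAX_PHYS_4LEVEL are hypothetical names. It
assumes the only criterion is whether physical memory fits under the
64 TiB that 4-level paging can address, and ignores userspace that
wants more than 47 bits of virtual address space:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Physical address space limit with 4-level paging: 64 TiB. */
#define MAX_PHYS_4LEVEL	(UINT64_C(64) << 40)

/*
 * Pick the page table depth at boot.
 * 'max_phys_end' is the exclusive end of the highest physical address.
 * 'cpu_has_la57' says whether the CPU supports 5-level paging.
 */
int choose_paging_levels(uint64_t max_phys_end, bool cpu_has_la57)
{
	if (!cpu_has_la57)
		return 4;

	return max_phys_end > MAX_PHYS_4LEVEL ? 5 : 4;
}
```

With a policy like this, any machine with up to 64 TiB of memory would
boot with 4-level paging, which is exactly where the testing concern
comes from.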

I would like to keep it on, if possible.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov