Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

> On 16/05/2016 18:51, Nadav Amit wrote:
>> Thanks! I appreciate it.
>>
>> I think your experiment with global paging just corroborates that the
>> latency is caused by TLB misses. I measured TLB misses (and especially
>> STLB misses) in other experiments, but not in this one. I will run some
>> more experiments, specifically to test how AMD behaves.
>
> I'm curious about AMD too now...
>
> with invlpg:        285,639,427
> with full flush:    584,419,299
> invlpg only          70,681,128
> full flushes only   265,238,766
> access net          242,538,804
> w/full flush net    319,180,533
> w/invlpg net        214,958,299
>
> Roughly the same with and without pte.g. So AMD behaves as it should.
>
>> I should note this is a byproduct of a study I did, and it is not as if
>> I was looking for strange behaviors (no more validation papers for me!).
>>
>> The strangest thing is that on bare metal I don’t see this phenomenon,
>> so I doubt it is a CPU “feature”. Once we understand it, at the very
>> least it may affect the recommended value of
>> “tlb_single_page_flush_ceiling”, which controls when the kernel performs
>> a full TLB flush vs. selective flushes.
>
> Do you have a kernel module to reproduce the test on bare metal? (/me is
> lazy).

It came to my mind that I never told you what eventually turned out to be
the issue. (Yes, I know it is a very old thread, but you may still be
interested.)

It turns out that Intel has something called “page fracturing”: once the
TLB caches a translation that came from a 2MB guest page backed by a 4KB
host page, INVLPG ends up flushing the entire TLB.

I guess they need to do it to follow SDM 4.10.4.1 (regarding pages larger
than 4 KBytes):

  “The INVLPG instruction and page faults provide the same assurances that
  they provide when a single TLB entry is used: they invalidate all TLB
  entries corresponding to the translation specified by the paging
  structures.”

Thanks again for your help,
Nadav
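
P.S. Since you asked about reproducing the test on bare metal: below is a
rough, hypothetical sketch of the kind of kernel-module measurement
involved, not the actual code from my experiments. The module and function
names (tlbtest_init, flush_one, flush_all) are made up, it assumes x86-64,
and it disables interrupts and preemption around the timed loops. One
caveat: kernel mappings are normally global, so the plain CR3 reload below
does not evict them; a faithful full-flush measurement would toggle
CR4.PGE or target a non-global user mapping.

/*
 * Hypothetical sketch only (not the module used for the numbers above):
 * compare the cost of INVLPG + re-access against a full CR3-reload flush
 * + re-access on bare metal. Assumes x86-64.
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <linux/preempt.h>
#include <linux/irqflags.h>

#define ITERS 100000

static inline u64 rdtsc_serialized(void)
{
	u32 lo, hi;

	/* LFENCE keeps RDTSC from executing ahead of earlier instructions. */
	asm volatile("lfence; rdtsc" : "=a" (lo), "=d" (hi) : : "memory");
	return ((u64)hi << 32) | lo;
}

static inline void flush_one(const volatile void *addr)
{
	asm volatile("invlpg (%0)" : : "r" (addr) : "memory");
}

static inline void flush_all(void)
{
	unsigned long cr3;

	/*
	 * Reloading CR3 flushes non-global TLB entries. Kernel mappings
	 * are normally global, so this does NOT evict them; a real test
	 * would toggle CR4.PGE or use a user mapping instead.
	 */
	asm volatile("mov %%cr3, %0; mov %0, %%cr3"
		     : "=r" (cr3) : : "memory");
}

static int __init tlbtest_init(void)
{
	volatile char *page;
	unsigned long flags;
	u64 start, t_invlpg, t_full;
	int i;

	page = vmalloc(PAGE_SIZE);	/* vmalloc gives a 4KB kernel mapping */
	if (!page)
		return -ENOMEM;

	local_irq_save(flags);
	preempt_disable();

	start = rdtsc_serialized();
	for (i = 0; i < ITERS; i++) {
		flush_one(page);
		(void)page[0];		/* re-access to refill the TLB */
	}
	t_invlpg = rdtsc_serialized() - start;

	start = rdtsc_serialized();
	for (i = 0; i < ITERS; i++) {
		flush_all();
		(void)page[0];
	}
	t_full = rdtsc_serialized() - start;

	preempt_enable();
	local_irq_restore(flags);

	pr_info("tlbtest: invlpg+access %llu cycles, full flush+access %llu cycles\n",
		t_invlpg, t_full);

	vfree((void *)page);
	return -EAGAIN;		/* nothing to keep resident; refuse to load */
}

module_init(tlbtest_init);
MODULE_LICENSE("GPL");

The results land in dmesg on insmod; the module intentionally fails to
load so there is nothing to rmmod afterwards.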