Argh... I don’t get the same behavior in the guest with the module test.
I’ll need some more time to figure it out.

Just a small comment regarding your “global” test: you forgot to set
CR4.PGE. Once I set it, I get reasonable numbers (excluding the invlpg
flavor):

  with invlpg:      964431529
  with full flush:  268190767
  invlpg only       126114041
  full flushes only 185971818
  access net        111229828
  w/full flush net   82218949  -> similar to access net
  w/invlpg net      838317488

I’ll be back when I have more understanding of the situation.

Thanks,
Nadav

Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

>
>
> On 16/05/2016 18:51, Nadav Amit wrote:
>> Thanks! I appreciate it.
>>
>> I think your experiment with global paging just corroborates that the
>> latency is caused by TLB misses. I measured TLB misses (and especially
>> STLB misses) in other experiments, but not in this one. I will run some
>> more experiments, specifically to test how AMD behaves.
>
> I'm curious about AMD too now...
>
>   with invlpg:      285,639,427
>   with full flush:  584,419,299
>   invlpg only        70,681,128
>   full flushes only 265,238,766
>   access net        242,538,804
>   w/full flush net  319,180,533
>   w/invlpg net      214,958,299
>
> Roughly the same with and without pte.g. So AMD behaves as it should.
>
>> I should note this is a byproduct of a study I did; it is not as if I was
>> looking for strange behaviors (no more validation papers for me!).
>>
>> The strangest thing is that I don't see this phenomenon on bare metal, so
>> I doubt it is a CPU "feature". Once we understand it, at the very least
>> it may affect the recommended value of "tlb_single_page_flush_ceiling",
>> which controls when the kernel performs a full TLB flush vs. selective
>> flushes.
>
> Do you have a kernel module to reproduce the test on bare metal? (/me is
> lazy).
>
> Paolo
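
For completeness, here is a minimal sketch of the CR4.PGE fix mentioned
above, written as freestanding C with inline assembly rather than against
any particular test harness; the helper names are illustrative, not taken
from the actual test:

    #define X86_CR4_PGE (1UL << 7)       /* CR4 bit 7: enable global pages */

    static inline unsigned long read_cr4(void)
    {
            unsigned long cr4;

            asm volatile("mov %%cr4, %0" : "=r"(cr4));
            return cr4;
    }

    static inline void write_cr4(unsigned long cr4)
    {
            asm volatile("mov %0, %%cr4" : : "r"(cr4) : "memory");
    }

    /*
     * PTE.G only takes effect once CR4.PGE is set, so without this the
     * "global" variant of the test is not actually exercising global pages.
     */
    static void enable_global_pages(void)
    {
            write_cr4(read_cr4() | X86_CR4_PGE);
    }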
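
On the bare-metal question: this is not the module used above, only a rough
sketch of the kind of comparison being discussed, assuming an x86-64 kernel
build; the buffer size, loop shape, and the "tlbtest" name are made up for
illustration:

    /* Sketch: time page accesses interleaved with invlpg vs. a CR3 reload. */
    #include <linux/module.h>
    #include <linux/vmalloc.h>
    #include <linux/irqflags.h>
    #include <asm/timex.h>          /* get_cycles() */

    #define NPAGES 512

    static void flush_one(void *addr)
    {
            asm volatile("invlpg (%0)" : : "r"(addr) : "memory");
    }

    static void flush_all(void)
    {
            unsigned long cr3;

            /* CR3 reload: flushes non-global TLB entries only. */
            asm volatile("mov %%cr3, %0\n\tmov %0, %%cr3"
                         : "=r"(cr3) : : "memory");
    }

    static int __init tlbtest_init(void)
    {
            unsigned char *buf = vmalloc(NPAGES * PAGE_SIZE);
            volatile unsigned char sink;
            cycles_t t0, t_invlpg, t_full;
            unsigned long flags;
            int i;

            if (!buf)
                    return -ENOMEM;

            local_irq_save(flags);

            /* Access one byte per page, invlpg that page each time. */
            t0 = get_cycles();
            for (i = 0; i < NPAGES; i++) {
                    sink = buf[i * PAGE_SIZE];
                    flush_one(buf + i * PAGE_SIZE);
            }
            t_invlpg = get_cycles() - t0;

            /* Same accesses, but with a full flush after each one. */
            t0 = get_cycles();
            for (i = 0; i < NPAGES; i++) {
                    sink = buf[i * PAGE_SIZE];
                    flush_all();
            }
            t_full = get_cycles() - t0;

            local_irq_restore(flags);
            (void)sink;

            pr_info("tlbtest: access+invlpg %llu cycles, access+full flush %llu cycles\n",
                    (unsigned long long)t_invlpg,
                    (unsigned long long)t_full);

            vfree(buf);
            return 0;
    }

    static void __exit tlbtest_exit(void) { }

    module_init(tlbtest_init);
    module_exit(tlbtest_exit);
    MODULE_LICENSE("GPL");

Loading it with insmod prints the two cycle counts to the kernel log;
interrupts are disabled around each loop so the numbers are at least
roughly stable.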