I encountered a strange phenomenon and would appreciate your sanity check
and opinion. It looks as if an 'invlpg' that runs in a VM causes a very
broad flush.

I created a small kvm-unit-test (below) to show what I am talking about.
The test touches 50 pages and then either: (1) runs a full flush, (2) runs
invlpg on an arbitrary (other) address, or (3) runs a memory barrier.

It appears that the execution time of the test is indeed determined by TLB
misses, since the runtime of the memory-barrier flavor is considerably
lower.

What I find strange is that if I compute the net access time for tests 1 &
2, by deducting the time of the flushes, the time is almost identical. I am
aware that invlpg flushes the page-walk caches, but I would still expect
the invlpg flavor to run considerably faster than the full-flush flavor.

Am I missing something?

On my Haswell EP I get the following results (TSC ticks, as read with
rdtsc):

with invlpg:        948965249
with full flush:    1047927009
invlpg only         127682028
full flushes only   224055273
access net          107691277   --> considerably lower than w/flushes
w/full flush net    823871736
w/invlpg net        821283221   --> almost identical to full-flush net

---

#include "libcflat.h"
#include "fwcfg.h"
#include "vm.h"
#include "smp.h"

#define N_PAGES		(50)
#define ITERATIONS	(500000)

volatile char buf[N_PAGES * PAGE_SIZE] __attribute__ ((aligned (PAGE_SIZE)));

int main(void)
{
	/* Arbitrary address that is not part of buf */
	void *another_addr = (void*)0x50f9000;
	int i, j;
	unsigned long t_start, t_single, t_full, t_single_only, t_full_only,
		      t_access;
	unsigned long cr3;
	char v = 0;

	setup_vm();
	cr3 = read_cr3();

	/* (2) invlpg of an unrelated address, then touch every page */
	t_start = rdtsc();
	for (i = 0; i < ITERATIONS; i++) {
		invlpg(another_addr);
		for (j = 0; j < N_PAGES; j++)
			v = buf[PAGE_SIZE * j];
	}
	t_single = rdtsc() - t_start;
	printf("with invlpg:        %lu\n", t_single);

	/* (1) full flush (CR3 reload), then touch every page */
	t_start = rdtsc();
	for (i = 0; i < ITERATIONS; i++) {
		write_cr3(cr3);
		for (j = 0; j < N_PAGES; j++)
			v = buf[PAGE_SIZE * j];
	}
	t_full = rdtsc() - t_start;
	printf("with full flush:    %lu\n", t_full);

	/* Cost of the invlpg instructions alone */
	t_start = rdtsc();
	for (i = 0; i < ITERATIONS; i++)
		invlpg(another_addr);
	t_single_only = rdtsc() - t_start;
	printf("invlpg only         %lu\n", t_single_only);

	/* Cost of the full flushes alone */
	t_start = rdtsc();
	for (i = 0; i < ITERATIONS; i++)
		write_cr3(cr3);
	t_full_only = rdtsc() - t_start;
	printf("full flushes only   %lu\n", t_full_only);

	/* (3) touch every page, then a memory barrier instead of a flush */
	t_start = rdtsc();
	for (i = 0; i < ITERATIONS; i++) {
		for (j = 0; j < N_PAGES; j++)
			v = buf[PAGE_SIZE * j];
		mb();
	}
	t_access = rdtsc() - t_start;
	printf("access net          %lu\n", t_access);

	/* Net access time: total time minus the time of the flushes alone */
	printf("w/full flush net    %lu\n", t_full - t_full_only);
	printf("w/invlpg net        %lu\n", t_single - t_single_only);

	(void)v;
	return 0;
}