Re: x86: strange behavior of invlpg

On 14/05/2016 11:35, Nadav Amit wrote:
> I encountered a strange phenomenon and I would appreciate your sanity check
> and opinion. It looks as if 'invlpg' that runs in a VM causes a very broad
> flush.
> 
> I created a small kvm-unit-test (below) to show what I am talking about. The test
> touches 50 pages, and then either: (1) runs full flush, (2) runs invlpg to
> an arbitrary (other) address, or (3) runs memory barrier.
> 
> It appears that the execution time of the test is indeed determined by TLB
> misses, since the runtime of the memory barrier flavor is considerably lower.

Did you check the performance counters?  Another explanation is that
there are no TLB misses at all, and that CR3 writes are optimized in
such a way that they do not incur TLB misses either.  (Disclaimer: I
didn't check the performance counters to prove the alternative theory ;)).
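
For what it's worth, here is a minimal, untested sketch of how I would
count dTLB load misses from the host with perf_event_open(2); the pid
you pass would be the vCPU thread you want to observe, and the generic
hardware-cache dTLB events are assumed to be available on the machine:

/* Count dTLB load misses around a region of interest (hedged sketch). */
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>

static int open_dtlb_miss_counter(pid_t pid)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HW_CACHE;
    attr.config = PERF_COUNT_HW_CACHE_DTLB |
                  (PERF_COUNT_HW_CACHE_OP_READ << 8) |
                  (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
    attr.disabled = 1;

    /* one counter, any CPU, no group, no flags */
    return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
}

int main(void)
{
    uint64_t misses;
    int fd = open_dtlb_miss_counter(0);   /* 0 = calling process, for illustration */

    if (fd < 0)
        return 1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    /* ... run (or wait for) the workload under test here ... */
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    if (read(fd, &misses, sizeof(misses)) == sizeof(misses))
        printf("dTLB load misses: %llu\n", (unsigned long long)misses);

    close(fd);
    return 0;
}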

> What I find strange is that if I compute the net access time for tests 1 & 2,
> by deducting the time of the flushes, the time is almost identical. I am aware
> that invlpg flushes the page-walk caches, but I would still expect the invlpg
> flavor to run considerably faster than the full-flush flavor.

That's interesting.  I guess you're using EPT, because I get very
similar numbers on an Ivy Bridge laptop:

  with invlpg:        902,224,568
  with full flush:    880,103,513
  invlpg only         113,186,461
  full flushes only   100,236,620
  access net          104,454,125
  w/full flush net    779,866,893
  w/invlpg net        789,038,107

(commas added for readability).
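
(The "net" rows are just the corresponding totals minus the flush-only
loops, i.e. t_full - t_full_only and t_single - t_single_only in the
test below:

  w/full flush net = 880,103,513 - 100,236,620 = 779,866,893
  w/invlpg net     = 902,224,568 - 113,186,461 = 789,038,107

so on this machine too the invlpg flavor gains almost nothing over the
full flush.)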

Out of curiosity I tried making all pages global (patch after my
signature).  Both invlpg and write to CR3 become much faster, but
invlpg now is faster than full flush, even though in theory it
should be the opposite: a MOV to CR3 leaves global TLB entries in
place, while invlpg invalidates the entry regardless of the G bit
and also drops the paging-structure caches...

  with invlpg:        223,079,661
  with full flush:    294,280,788
  invlpg only         126,236,334
  full flushes only   107,614,525
  access net           90,830,503
  w/full flush net    186,666,263
  w/invlpg net         96,843,327
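
One caveat with the patch: global translations only take effect with
CR4.PGE set, so if the numbers look suspicious it may be worth asserting
that in the test.  A minimal sketch, assuming read_cr4()/write_cr4()
accessors analogous to the read_cr3()/write_cr3() used below:

#define X86_CR4_PGE (1ul << 7)   /* CR4 bit 7 enables global pages */

static void enable_global_pages(void)
{
    unsigned long cr4 = read_cr4();

    if (!(cr4 & X86_CR4_PGE))
        write_cr4(cr4 | X86_CR4_PGE);
}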

Thanks for the interesting test!

Paolo

diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 7ce7bbc..3b9b81a 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -2,6 +2,7 @@
 #include "vm.h"
 #include "libcflat.h"
 
+#define PTE_GLOBAL      256
 #define PAGE_SIZE 4096ul
 #ifdef __x86_64__
 #define LARGE_PAGE_SIZE (512 * PAGE_SIZE)
@@ -106,14 +107,14 @@ unsigned long *install_large_page(unsigned long *cr3,
 				  void *virt)
 {
     return install_pte(cr3, 2, virt,
-		       phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_PSE, 0);
+		       phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_PSE | PTE_GLOBAL, 0);
 }
 
 unsigned long *install_page(unsigned long *cr3,
 			    unsigned long phys,
 			    void *virt)
 {
-    return install_pte(cr3, 1, virt, phys | PTE_PRESENT | PTE_WRITE | PTE_USER, 0);
+    return install_pte(cr3, 1, virt, phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_GLOBAL, 0);
 }
 
 

> Am I missing something?
> 
> 
> On my Haswell EP I get the following results:
> 
> with invlpg:        948965249
> with full flush:    1047927009
> invlpg only         127682028
> full flushes only   224055273
> access net          107691277	--> considerably lower than w/flushes
> w/full flush net    823871736
> w/invlpg net        821283221	--> almost identical to full-flush net
> 
> ---
> 
> 
> #include "libcflat.h"
> #include "fwcfg.h"
> #include "vm.h"
> #include "smp.h"
> 
> #define N_PAGES	(50)
> #define ITERATIONS (500000)
> volatile char buf[N_PAGES * PAGE_SIZE] __attribute__ ((aligned (PAGE_SIZE)));
> 
> int main(void)
> {
>     void *another_addr = (void*)0x50f9000;
>     int i, j;
>     unsigned long t_start, t_single, t_full, t_single_only, t_full_only,
> 		  t_access;
>     unsigned long cr3;
>     char v = 0;
> 
>     setup_vm();
> 
>     cr3 = read_cr3();
> 
>     t_start = rdtsc();
>     for (i = 0; i < ITERATIONS; i++) {
>         invlpg(another_addr);
> 	for (j = 0; j < N_PAGES; j++)
>             v = buf[PAGE_SIZE * j];
>     }
>     t_single = rdtsc() - t_start;
>     printf("with invlpg:        %lu\n", t_single);
> 
>     t_start = rdtsc();
>     for (i = 0; i < ITERATIONS; i++) {
>     	write_cr3(cr3);
> 	for (j = 0; j < N_PAGES; j++)
>             v = buf[PAGE_SIZE * j];
>     }
>     t_full = rdtsc() - t_start;
>     printf("with full flush:    %lu\n", t_full);
> 
>     t_start = rdtsc();
>     for (i = 0; i < ITERATIONS; i++)
>          invlpg(another_addr);
>     t_single_only = rdtsc() - t_start;
>     printf("invlpg only         %lu\n", t_single_only);
> 
>     t_start = rdtsc();
>     for (i = 0; i < ITERATIONS; i++)
>     	 write_cr3(cr3);
>     t_full_only = rdtsc() - t_start;
>     printf("full flushes only   %lu\n", t_full_only);
> 
>     t_start = rdtsc();
>     for (i = 0; i < ITERATIONS; i++) {
> 	for (j = 0; j < N_PAGES; j++)
>             v = buf[PAGE_SIZE * j];
> 	mb();
>     }
>     t_access = rdtsc()-t_start;
>     printf("access net          %lu\n", t_access);
>     printf("w/full flush net    %lu\n", t_full - t_full_only);
>     printf("w/invlpg net        %lu\n", t_single - t_single_only);
> 
>     (void)v;
>     return 0;
> }


