2012/3/18 Gleb Natapov <gleb@xxxxxxxxxx>:
> On Sun, Mar 18, 2012 at 06:27:53PM +0530, shashank rachamalla wrote:
>> 2012/3/18 Gleb Natapov <gleb@xxxxxxxxxx>:
>> > On Sun, Mar 18, 2012 at 03:48:53PM +0530, shashank rachamalla wrote:
>> >>
>> >> CPU: Core 2, speed 2200.19 MHz (estimated)
>> >> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
>> >> unit mask of 0x00 (Unhalted core cycles) count 100000
>> >> CPU_CLK_UNHALT...|
>> >>   samples|      %|
>> >> ------------------
>> >>         1 100.000 /no-vmlinux
>> >>
>> >>
>> > What happens if you run "perf stat -e cycles a.out"?
>>
>> This is what i get when I run "perf stat ./a.out" ( with --cpu host
>> passed in qemu-kvm )
>>
>>  Performance counter stats for './a.out':
>>
>>      954.925996  task-clock-msecs         #      0.955 CPUs
>>              23  context-switches         #      0.000 M/sec
>>               0  CPU-migrations           #      0.000 M/sec
>>              93  page-faults              #      0.000 M/sec
>>      2110635735  cycles                   #   2210.261 M/sec
>>       901816391  instructions             #      0.427 IPC
>>          247677  cache-references         #      0.259 M/sec
>>           15044  cache-misses             #      0.016 M/sec
>>
>>     0.999846560  seconds time elapsed
>>
>> This is without --cpu host ( Note that only software events get monitored )
>>
>>  Performance counter stats for './a.out':
>>
>>      913.826372  task-clock-msecs         #      0.990 CPUs
>>              13  context-switches         #      0.000 M/sec
>>               0  CPU-migrations           #      0.000 M/sec
>>              93  page-faults              #      0.000 M/sec
>>   <not counted>  cycles
>>   <not counted>  instructions
>>   <not counted>  cache-references
>>   <not counted>  cache-misses
>>
>>     0.923182044  seconds time elapsed
>>
>> I guess things are working fine with perf. But why not with oprofile ?
>>
> Looks like it. I never tried oprofile. Will try to reproduce your
> problem and see what oprofile is doing.

I am using Ubuntu 10.04 with the 2.6.32-21-generic kernel as the guest,
and oprofile 0.9.6.

I have also captured kvm-events ( perf patch ) on the host while running
oprofile and perf in the guest. Please see the attachment. I ran each of
the three cases for around 5 seconds.
There are more MSR reads and writes with perf than in the normal case,
which I think is expected. With oprofile, however, there are very few MSR
reads and writes, and the number of NMI exceptions is far higher.

>
> --
>			Gleb.
# Normal Case ( without any profiling in guest )

Analyze events for all VCPUs:

             VM-EXIT   Samples  Samples%     Time%      Avg time

      IO_INSTRUCTION      1953    69.03%     2.48%       63.20us ( +- 49.77% )
         APIC_ACCESS       343    12.12%     0.07%       10.11us ( +-  4.86% )
       EXCEPTION_NMI       289    10.22%     0.03%        5.03us ( +-  2.51% )
   PENDING_INTERRUPT        81     2.86%     0.00%        1.69us ( +-  1.19% )
           CR_ACCESS        77     2.72%     0.01%        6.04us ( +-  6.86% )
                 HLT        72     2.55%    97.41%    67426.33us ( +-  9.26% )
  EXTERNAL_INTERRUPT        14     0.49%     0.01%       19.50us ( +- 22.36% )

# With perf in guest

Analyze events for all VCPUs:

             VM-EXIT   Samples  Samples%     Time%      Avg time

       EXCEPTION_NMI      6906    26.93%     0.62%        4.16us ( +-  1.56% )
           MSR_WRITE      5894    22.99%     0.61%        4.83us ( +-  3.51% )
      IO_INSTRUCTION      3981    15.53%     2.81%       32.76us ( +- 21.95% )
            MSR_READ      2702    10.54%     0.07%        1.14us ( +-  0.38% )
         APIC_ACCESS      2370     9.24%     0.30%        5.78us ( +-  1.81% )
              INVLPG      2058     8.03%     0.10%        2.17us ( +-  4.98% )
  EXTERNAL_INTERRUPT       830     3.24%     0.35%       19.29us ( +- 25.56% )
           CR_ACCESS       418     1.63%     0.05%        5.63us ( +-  9.14% )
   PENDING_INTERRUPT       372     1.45%     0.01%        1.67us ( +-  0.47% )
                 HLT        93     0.36%    95.08%    47371.95us ( +- 10.55% )
               CPUID        18     0.07%     0.00%        1.28us ( +-  6.54% )

# With oprofile in guest

Analyze events for all VCPUs:

             VM-EXIT   Samples  Samples%     Time%      Avg time

       EXCEPTION_NMI    161092    60.98%    12.48%        3.15us ( +-  0.56% )
              INVLPG     94184    35.65%     4.30%        1.85us ( +-  1.08% )
      IO_INSTRUCTION      2674     1.01%     2.39%       36.35us ( +- 31.43% )
           CR_ACCESS      2658     1.01%     1.05%       16.00us ( +-  3.06% )
         APIC_ACCESS      1739     0.66%     0.29%        6.80us ( +-  2.24% )
  EXTERNAL_INTERRUPT      1044     0.40%     0.55%       21.58us ( +- 27.84% )
               CPUID       539     0.20%     0.02%        1.28us ( +-  2.73% )
   PENDING_INTERRUPT       159     0.06%     0.01%        1.73us ( +-  1.09% )
                 HLT        73     0.03%    78.92%    43963.08us ( +- 12.33% )
           MSR_WRITE         7     0.00%     0.00%       23.73us ( +- 69.57% )
            MSR_READ         3     0.00%     0.00%        2.74us ( +-  8.44% )
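
To make the comparison between the three cases easier to read, here is a minimal Python sketch that turns the sample counts above into approximate per-second exit rates. It is only an illustration: the dictionary and function names are made up for this example, the 5-second duration is the rough figure mentioned in the mail rather than a measured value, and exit reasons that do not appear in a table are assumed to have a count of 0.

```python
# Sample counts copied from the three kvm-events tables above.
# Only the exit reasons discussed in the mail are included.
EXIT_SAMPLES = {
    "normal":   {"EXCEPTION_NMI": 289},
    "perf":     {"EXCEPTION_NMI": 6906, "MSR_WRITE": 5894, "MSR_READ": 2702},
    "oprofile": {"EXCEPTION_NMI": 161092, "MSR_WRITE": 7, "MSR_READ": 3},
}

# Assumption: each case ran for "around 5 secs", so rates are approximate.
APPROX_DURATION_S = 5.0


def exits_per_second(case, reason, duration_s=APPROX_DURATION_S):
    """Approximate VM exits per second for one exit reason in one case.

    A reason missing from a table is treated as 0 samples (an assumption).
    """
    return EXIT_SAMPLES[case].get(reason, 0) / duration_s


def sample_ratio(case_a, case_b, reason):
    """How many times more samples case_a has than case_b for a reason."""
    return EXIT_SAMPLES[case_a][reason] / EXIT_SAMPLES[case_b][reason]


if __name__ == "__main__":
    for case in EXIT_SAMPLES:
        for reason in ("EXCEPTION_NMI", "MSR_WRITE", "MSR_READ"):
            rate = exits_per_second(case, reason)
            print(f"{case:>8} {reason:<14} {rate:10.1f} exits/s")
    nmi_ratio = sample_ratio("oprofile", "perf", "EXCEPTION_NMI")
    print(f"oprofile/perf NMI sample ratio: {nmi_ratio:.1f}")
```

The ratio is computed from raw sample counts, so it does not depend on the assumed duration as long as all three runs lasted about the same time.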