Il 03/02/2014 18:06, Xin Tong ha scritto:
/.../qemu-system-x86_64 TID 2537 [TID 2537] (877 ticks/71.24%)
This is the CPU thread (calls into the KVM modules).
/.../vmlinux (395 ticks/45.04%)
/kvm (198 ticks/22.58%)
/kvm_intel (153 ticks/17.45%)
/.../qemu-system-x86_64 (71 ticks/8.10%)
/.../libc-2.15.so (47 ticks/5.36%)
/.../libpthread-2.15.so (12 ticks/1.37%)
/libahci (1 ticks/0.11%)
/.../qemu-system-x86_64 TID 2658 [TID 2658] (200 ticks/16.25%)
This is probably the VNC thread.
/.../vmlinux (190 ticks/95.00%)
/.../libpthread-2.15.so (6 ticks/3.00%)
/.../libc-2.15.so (2 ticks/1.00%)
/.../qemu-system-x86_64 (1 ticks/0.50%)
[vdso] (tgid:2534 range:0x7fff10588000-0x7fff10589fff) (1 ticks/0.50%)
/.../qemu-system-x86_64 TID 2534 [TID 2534] (154 ticks/12.51%)
This is the main thread (lowest TID of all).
/ksmd [PID 53] (83 ticks/5.36%)
/.../top [PID 2617] (43 ticks/2.78%)
/.../unity-2d-shell [PID 1807] (41 ticks/2.65%)
Now, going to your questions:
2. why is qemu-system-x86_64 only taking 8.10% of the time, i imagine
most of the time should be spent in qemu-system-x86_64 as 403.GCC does
not do too much IO.
This is the same I already answered: the time spent running the guest is
in the context of process qemu-system-x86_64, but it is not running code
from the qemu executable. In fact, 8.10% spent in qemu-system-x86_64 is
a lot. I would expect much less.
3. why are so much time spent in vmlinux ? the symbols for vmlinux is
listed below.
4. why are so much time spent in kvm-intel and kvm ? the symbols for
both are listed below.
Again, I think I already answered this. The time spent in vmx_vcpu_run
is the actual time spent in the guest (17.45*60.13% = 10% roughly).
Everything else is overhead introduced by either virtualization or
profiling.
What SPEC scores are you getting from bare-metal, KVM without oprofile
and KVM with oprofile? Is the profile substantially different if you
use perf instead of oprofile?
Can you run "scripts/kvm/kvm_stat" (from the QEMU tree) while your guest
is running (under oprofile) and paste the output from that tool?
Thanks,
Paolo
/.../vmlinux (395 ticks/45.04%):
CPU_CLK_UNHALTED % Symbol/Functions
82 20.76 native_write_msr_safe
46 11.65 native_read_msr_safe
25 6.33 __srcu_read_lock
16 4.05 native_load_tr_desc
12 3.04 __srcu_read_unlock
9 2.28 native_load_gdt
8 2.03 fget_light
8 2.03 _raw_spin_lock_irq
7 1.77 memcpy
6 1.52 _raw_spin_lock
6 1.52 __set_current_blocked
5 1.27 user_return_notifier_unregister
5 1.27 update_cfs_shares
5 1.27 page_fault
5 1.27 bsearch
/kvm (198 ticks/22.58%)
CPU_CLK_UNHALTED % Symbol/Functions
42 21.21 kvm_arch_vcpu_ioctl_run
9 4.55 kvm_set_shared_msr
7 3.54 x86_emulate_insn
7 3.54 paging64_walk_addr_generic
7 3.54 kvm_on_user_return
7 3.54 do_insn_fetch
6 3.03 gfn_to_memslot
6 3.03 decode_operand
5 2.53 kvm_io_bus_sort_cmp
5 2.53 kvm_arch_vcpu_load
5 2.53 kvm_apic_accept_pic_intr
4 2.02 kvm_fetch_guest_virt
4 2.02 kvm_arch_vcpu_runnable
4 2.02 emulator_read_gpr
4 2.02 apic_has_pending_timer
3 1.52 x86_decode_insn
/kvm_intel (153 ticks/17.45%)
CPU_CLK_UNHALTED % Symbol/Functions
92 60.13 vmx_vcpu_run
9 5.88 vmx_save_host_state
7 4.58 vmx_handle_external_intr
7 4.58 vmcs_writel
6 3.92 vmx_handle_exit
6 3.92 __vmx_load_host_state.part.43
4 2.61 clear_atomic_switch_msr
3 1.96 vmx_vm_has_apicv
3 1.96 vmx_get_rflags
2 1.31 vmx_read_guest_seg_selector
2 1.31 vmcs_clear_bits
2 1.31 skip_emulated_instruction
1 0.65 vmx_set_rflags
1 0.65 vmx_vcpu_load
Thank you,
Xin
On Mon, Feb 3, 2014 at 2:11 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
Il 02/02/2014 16:47, Xin Tong ha scritto:
On Sun, Feb 2, 2014 at 5:37 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
Il 02/02/2014 03:08, Xin Tong ha scritto:
I am getting very weird profile results by running operf on linux on
the host and profiling the a kvm virtual machine running dacapo
eclipse benchmark. I am expecting a lot of time should be spent in
the qemu-system-x86_64 as the instructions from the eclipse benchmark
would be treated as part of the qemu-system-x86_64 process, but the
results tell different. any suggestions ?
Most of the time should be spent running the guest. This is in the
context
of process qemu-system-x86_64, but it is not running code from the qemu
executable.
That is what i was trying to say. you said it better.
What likely happens is that when the profiling counter expires, it causes
the VM to exit before the profiling interrupt is delivered. The sample
should then be associated to the kvm_intel module.
could not the kvm module read the counters and inject an counter
overflow interrupt into the guest ? what are some ways to make the
profile more accurate. this is collected on a intel haswell host
machine.
Yes, it can. But then you have to run perf/operf in the guest not in the
host.
For this to work, you need to specify "-cpu host" on the QEMU command line.
Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html