On 09/04/2009 05:48 PM, Andrew Theurer wrote:
Still not idle=poll, it may shave off 0.2%.
Won't this affect SMT in a negative way? (OK, I am not running SMT now,
but eventually we will be.) A long time ago, we tested P4s with HT, and
a polling idle in one thread always negatively impacted performance in
the sibling thread.
Sorry, I meant idle=halt. idle=poll is too wasteful to be used.
FWIW, I did try idle=halt, and it was slightly worse.
Interesting; I've heard that mwait latency is bad for spinlocks, but I
guess it's fine for idle.
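(For reference, a rough sketch of what the three idle flavours boil down
to; this is hand-written kernel-style C, loosely modelled on the x86 idle
routines rather than lifted from them, so the function names and exact
flow are illustrative only:)

/*
 * poll: spin until a task is runnable.  The pause instruction in
 * cpu_relax() is cheap, but the hardware thread keeps issuing uops,
 * which is what starves an SMT sibling.
 */
static void poll_idle_sketch(void)
{
	while (!need_resched())
		cpu_relax();
}

/*
 * halt: sti;hlt sleeps the thread until the next interrupt, so the
 * sibling gets the core's resources, at the cost of interrupt-delivery
 * latency on wakeup.
 */
static void halt_idle_sketch(void)
{
	safe_halt();
}

/*
 * mwait: arm a monitor on the thread's flag word and sleep.  The thread
 * wakes on an interrupt or on a remote write to the monitored line, so
 * the sibling again gets the core while we are idle.  (The real kernel
 * re-enables interrupts atomically with the mwait via __sti_mwait.)
 */
static void mwait_idle_sketch(void)
{
	__monitor(&current_thread_info()->flags, 0, 0);
	if (!need_resched())
		__mwait(0, 0);
}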
profile1 is qemu-kvm-87
profile2 is qemu-master
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 10000000
total samples (ts1) for profile1 is 1616921
total samples (ts2) for profile2 is 1752347 (includes multiplier of 0.995420)
functions for which abs(pct2-pct1) < 0.06 are not displayed
                                pct2:     pct1:
                                100*      100*     pct2-
    s1         s2      s2/s1   s2/ts1    s1/ts1    pct1   symbol               bin
---------  ---------  -------  -------  -------  ------  -------------------  ------
879611 907883 1.03/1 56.149 54.400 1.749 vmx_vcpu_run kvm
614 11553 18.82/1 0.715 0.038 0.677 gfn_to_memslot_unali kvm.ko
34511 44922 1.30/1 2.778 2.134 0.644 phys_page_find_alloc qemu
2866 9334 3.26/1 0.577 0.177 0.400 paging64_walk_addr kvm.ko
11139 17200 1.54/1 1.064 0.689 0.375 copy_user_generic_st vmlinux
3100 7108 2.29/1 0.440 0.192 0.248 x86_decode_insn kvm.ko
8169 11873 1.45/1 0.734 0.505 0.229 virtqueue_avail_byte qemu
1103 4540 4.12/1 0.281 0.068 0.213 kvm_read_guest kvm.ko
17427 20401 1.17/1 1.262 1.078 0.184 memcpy libc
0 2905 0.180 0.000 0.180 gfn_to_pfn kvm.ko
1831 4328 2.36/1 0.268 0.113 0.154 x86_emulate_insn kvm.ko
65 2431 37.41/1 0.150 0.004 0.146 emulator_read_emulat kvm.ko
14922 17196 1.15/1 1.064 0.923 0.141 qemu_get_ram_ptr qemu
545 2724 5.00/1 0.168 0.034 0.135 emulate_instruction kvm.ko
599 2464 4.11/1 0.152 0.037 0.115 kvm_read_guest_page kvm.ko
503 2355 4.68/1 0.146 0.031 0.115 gfn_to_hva kvm.ko
1076 2918 2.71/1 0.181 0.067 0.114 memcpy_c vmlinux
594 2241 3.77/1 0.139 0.037 0.102 next_segment kvm.ko
1680 3248 1.93/1 0.201 0.104 0.097 pipe_poll vmlinux
0 1463 0.090 0.000 0.090 subpage_readl qemu
0 1363 0.084 0.000 0.084 msix_enabled qemu
527 1883 3.57/1 0.116 0.033 0.084 paging64_gpte_to_gfn kvm.ko
962 2223 2.31/1 0.138 0.059 0.078 do_insn_fetch kvm.ko
348 1605 4.61/1 0.099 0.022 0.078 is_rsvd_bits_set kvm.ko
520 1763 3.39/1 0.109 0.032 0.077 unalias_gfn kvm.ko
1 1163 1163.65/1 0.072 0.000 0.072 tdp_page_fault kvm.ko
3827 4912 1.28/1 0.304 0.237 0.067 __down_read vmlinux
0 1014 0.063 0.000 0.063 mapping_level kvm.ko
973 0 0.000 0.060 -0.060 pm_ioport_readl qemu
1635 528 1/3.09 0.033 0.101 -0.068 ioport_read qemu
2179 1017 1/2.14 0.063 0.135 -0.072 kvm_emulate_pio kvm.ko
25141 23722 1/1.06 1.467 1.555 -0.088 native_write_msr_saf vmlinux
1560 0 0.000 0.096 -0.096 eventfd_poll vmlinux
------- ------- ------
105.100 97.450 7.650
18x more samples for gfn_to_memslot_unali*, 37x for
emulator_read_emula*, and more CPU time in guest mode.
And 5x more instructions emulated. I wonder where that comes from.
One other thing: So far I have not been using preadv/pwritev. I assume
I need a more recent glibc (on 2.5 now) for qemu to take advantage of
this?
Yes, but it should be easy to write an LD_PRELOAD hack that will work
with your current glibc. It should certainly improve things.
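Something along these lines should do (an untested sketch: it assumes an
x86_64 host, a kernel >= 2.6.30 that actually implements the syscalls,
and a qemu binary built with preadv support; the file name and build
line are just for illustration):

/* preadv_shim.c - sketch of an LD_PRELOAD preadv/pwritev shim for an
 * old glibc (2.5) that lacks the wrappers.  x86_64 syscall numbers.
 *
 *   gcc -shared -fPIC -O2 -o preadv_shim.so preadv_shim.c
 *   LD_PRELOAD=./preadv_shim.so qemu-system-x86_64 ...
 */
#include <sys/uio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_preadv		/* old kernel headers won't define these */
#define __NR_preadv	295	/* x86_64 */
#define __NR_pwritev	296	/* x86_64 */
#endif

/* The kernel takes the offset as a low/high pair; on a 64-bit ABI the
 * whole offset fits in the low word and the high word is zero. */
ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
	return syscall(__NR_preadv, fd, iov, iovcnt,
		       (unsigned long)offset, 0UL);
}

ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
	return syscall(__NR_pwritev, fd, iov, iovcnt,
		       (unsigned long)offset, 0UL);
}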
--
error compiling committee.c: too many arguments to function